Nested quantum annealing correction

ABSTRACT

Systems and methods of processing using a quantum processor are described. A method includes obtaining a problem Hamiltonian and defining a nested Hamiltonian with a plurality of logical qubits by embedding a logical K_(N) representing the problem Hamiltonian into a larger K_(C×N), where N represents a number of the logical qubits and C represents a nesting level defining the amount of hardware resources for the nested Hamiltonian. The method also includes encoding the nested Hamiltonian into a plurality of physical qubits of the quantum processor and performing a quantum annealing process with the quantum processor after the encoding.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to and the benefit of U.S. Provisional Patent Application No. 62/350,618, entitled “NESTED QUANTUM ANNEALING CORRECTION” and filed Jun. 15, 2016, the contents of which are herein incorporated by reference in their entirety.

STATEMENT REGARDING FEDERALLY FUNDED RESEARCH

This invention was made with government support under grant INSPIRE-1551064 awarded by the National Science Foundation, and grants W911NF-11-1-0268, W911NF-15-1-0582, and W911NF-12-1-0523 awarded by the Army Research Office. The government has certain rights in the invention.

FIELD OF THE INVENTION

The present invention relates to an error-correcting scheme for quantum annealing that allows for the encoding of a logical qubit into an arbitrarily large number of physical qubits.

BACKGROUND

Quantum annealing (QA) attempts to exploit quantum fluctuations to solve computational problems faster than it is possible with classical computers. As an approach designed to solve optimization problems, QA is a special case of adiabatic quantum computation (AQC), a universal model of quantum computing. In AQC, a system is designed to follow the instantaneous ground state of a time-dependent Hamiltonian whose final ground state encodes the solution to the problem of interest. This results in a certain amount of stability, since the system can thermally relax to the ground state after an error, as well as resilience to errors, since the presence of a finite energy gap suppresses thermal and dynamical excitations.

Despite this inherent robustness to certain forms of noise, AQC requires error-correction to ensure scalability, just like any other form of quantum information processing. Various error correction proposals for AQC and QA have been made, but an accuracy-threshold theorem for AQC is not yet known, unlike in the circuit model. A direct AQC simulation of a fault-tolerant quantum circuit leads to many-body (high-weight) operators that are difficult to implement or to a myriad of other problems. Nevertheless, a scalable method to reduce the effective temperature would go a long way towards approaching the ideal of closed-system AQC, where quantum speedups are known to be possible.

SUMMARY

According to an aspect of an exemplary embodiment, a method of nested error correction for quantum annealing, to improve the performance of quantum annealers, includes defining a nested Hamiltonian by embedding a logical K_(N) into a larger K_(C×N), where N represents a number of logical qubits and C represents a nesting level and controls an amount of hardware resources used to represent a logical problem; implementing the nested Hamiltonian on quantum annealing hardware with a lower-degree qubit connectivity graph; and measuring a plurality of physical qubits.

According to another exemplary embodiment, the method further includes recovering a logical state using a decoding procedure.

According to another exemplary embodiment, the implementing further comprises minor embedding which includes replacing each qubit in the nested Hamiltonian by a ferromagnetically coupled chain of qubits, such that all couplings in the nested Hamiltonian are represented by inter-chain couplings.

According to another exemplary embodiment, the decoding procedure is performed over both a length (L) chain of each encoded qubit and C encoded qubits comprising each logical qubit.

According to another exemplary embodiment, a number of physical qubits necessary for the minor embedding of the K_(C×N) is N_(C,Phys)=CNL˜C²N².

According to another exemplary embodiment, the hardware resources comprise at least one of qubits, couplers and local fields.

According to another exemplary embodiment, each logical qubit i (i=1, . . . , N) is represented by a C-tuple of encoded qubits (i, c), with c=1, . . . , C.

According to another exemplary embodiment, the hardware resources comprise nested couplers J̃_((i,c),(j,c′)) and local fields h̃_((i,c)), where

-   J̃_((i,c),(j,c′))=J_(ij), ∀c,c′, i≠j,
-   h̃_((i,c))=Ch_(i), ∀c,i,
-   J̃_((i,c),(i,c′))=−γ, ∀c≠c′.

According to another exemplary embodiment, a processing system includes a digital computer having a digital processor and a memory having stored thereon instructions for causing the digital processor to: obtain a problem Hamiltonian and define a nested Hamiltonian with a plurality of logical qubits by embedding a logical K_(N) representing the problem Hamiltonian into a larger K_(C×N), where N represents a number of the logical qubits and C represents a nesting level defining the amount of hardware resources for the nested Hamiltonian. The processing system also includes an analog computer coupled to the digital computer, the analog computer including a quantum processor and configured for encoding the nested Hamiltonian into a plurality of physical qubits of the quantum processor, and performing a quantum annealing process with the quantum processor after the encoding.

In another exemplary embodiment, the analog computer is configured for measuring the plurality of physical qubits, and the instructions further comprise instructions for causing the digital processor to recover a logical state of each of the plurality of logical qubits using a decoding procedure.

In another exemplary embodiment, the encoding further includes performing a minor embedding process comprising replacing each of the plurality of logical qubits in the nested Hamiltonian by a ferromagnetically coupled chain of qubits, such that all couplings in the nested Hamiltonian are represented by inter-chain couplings.

In another exemplary embodiment, a number of physical qubits necessary for the minor embedding of the K_(C×N) is N_(C,Phys)=CNL˜C²N².

In another exemplary embodiment, the decoding procedure is performed over both a length (L) chain of each encoded qubit and C encoded qubits comprising each logical qubit.

In another exemplary embodiment, the hardware resources include at least one of qubits, couplers, and local fields.

In another exemplary embodiment, each logical qubit i (i=1, . . . , N) is represented by a C-tuple of encoded qubits (i, c), with c=1, . . . , C.

In another exemplary embodiment, the hardware resources include nested couplers J̃_((i,c),(j,c′)) and local fields h̃_((i,c)), where:

-   J̃_((i,c),(j,c′))=J_(ij), ∀c,c′, i≠j,
-   h̃_((i,c))=Ch_(i), ∀c,i,
-   J̃_((i,c),(i,c′))=−γ, ∀c≠c′.

Additional embodiments of the invention are described in the description and figures provided below.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A-1D show an illustration of the nesting scheme of the various embodiments.

FIGS. 2A-2C show results obtained with a D-Wave 2000Q quantum annealer for the antiferromagnetic K₄ illustrating temperature scaling for 8 nesting levels in accordance with the various embodiments.

FIGS. 3A-3C show results obtained with a D-Wave 2000Q quantum annealer for the antiferromagnetic K₄ illustrating temperature scaling for 13 nesting levels in accordance with the various embodiments.

FIG. 4 shows an exemplary system for implementing the various embodiments.

FIGS. 5A-5C show results obtained with a D-Wave 2000Q quantum annealer and with numerical simulations for the antiferromagnetic K₄, after encoding, followed by ME and decoding, in accordance with the various embodiments.

FIGS. 6A-6C show random antiferromagnetic K₈ results obtained with a D-Wave 2000Q quantum annealer and with numerical simulations, in accordance with the various embodiments.

FIGS. 7A-7B show some experimental results obtained with a D-Wave 2000Q quantum annealer and with numerical simulations on four ensembles of 100 fully connected weighted graphs on N=16 and N=24 variables, which can be encoded with up to C=3 and C=2 nesting levels, respectively, on the D-Wave 2000Q quantum annealer.

FIG. 8 is a plot showing the effect of separately optimizing γ for ME and penalties, obtained with numerical simulations.

FIGS. 9A and 9B show saturation removal for NQAC applied to antiferromagnetic K₄ in accordance with the various embodiments, obtained with numerical simulations.

FIGS. 10A-10C show parallel tempering (PT) results for antiferromagnetic K₄ with no noise on the couplers, in accordance with the various embodiments.

FIGS. 11A and 11B show MEs of a K₃₂ in accordance with the various embodiments.

FIGS. 12A and 12B show P_(C)(α) and adjusted P′_(C)(α), for the hard-K₈ instance, obtained in accordance with the various embodiments with a D-Wave 2000Q quantum annealer.

FIGS. 12C and 12D show P_(C)(α) and adjusted P′_(C)(α), for the easy-K₈ instance, obtained in accordance with the various embodiments with a D-Wave 2000Q quantum annealer.

FIGS. 13A and 13B show P_(C)(α) and adjusted P′_(C)(α), for the hard-K₁₀ instance, obtained in accordance with the various embodiments with a D-Wave 2000Q quantum annealer.

FIGS. 13C and 13D show P_(C)(α) and adjusted P′_(C)(α), for the easy-K₁₀ instance, obtained in accordance with the various embodiments with a D-Wave 2000Q quantum annealer.

FIGS. 14A-14D show the optimal penalty strength as a function of the energy scale for the instances considered, obtained with a D-Wave 2000Q quantum annealer.

FIGS. 15A-15D show data collapse and μ_(C) scaling results for the antiferromagnetic hard-K₈ problem considered above, as well as the easy-K₈ problem, obtained with a D-Wave 2000Q quantum annealer.

FIGS. 16A-16D show data collapse and μ_(C) scaling results for the antiferromagnetic hard-K₁₀ problem considered above, as well as the easy-K₁₀ problem, obtained with a D-Wave 2000Q quantum annealer.

DETAILED DESCRIPTION

The present invention is described with reference to the attached figures, wherein like reference numerals are used throughout the figures to designate similar or equivalent elements. The figures are not drawn to scale and they are provided merely to illustrate the instant invention. Several aspects of the invention are described below with reference to example applications for illustration. It should be understood that numerous specific details, relationships, and methods are set forth to provide a full understanding of the invention. One having ordinary skill in the relevant art, however, will readily recognize that the invention can be practiced without one or more of the specific details or with other methods. In other instances, well-known structures or operations are not shown in detail to avoid obscuring the invention. The present invention is not limited by the illustrated ordering of acts or events, as some acts may occur in different orders and/or concurrently with other acts or events. Furthermore, not all illustrated acts or events are required to implement a methodology in accordance with the present invention.

Motivated by the availability of commercial QA devices featuring hundreds of qubits, the various embodiments are directed to methods for error correction for QA. There is a consensus that these devices are significantly and adversely affected by decoherence, noise, and control errors, which makes them particularly interesting for the study of tailored, practical error correction techniques. Such techniques, known as quantum annealing correction (QAC) schemes, have already been experimentally shown to significantly improve the performance of quantum annealers, and theoretically analyzed using a mean-field approach. However, these QAC schemes are not easily generalizable to arbitrary optimization problems since they induce an encoded graph that is typically of a lower degree than the qubit-connectivity graph of the physical device. Moreover, they typically impose a fixed code distance, which limits their efficacy.

To overcome these limitations, the present disclosure presents a family of error-correcting codes for QA, based on a “nesting” scheme, that has the following properties: (1) it can handle arbitrary Ising-model optimization problems, (2) it can be implemented on present-day QA hardware, and (3) it is capable of an effective temperature reduction controlled by the code distance. The “nested quantum annealing correction” (NQAC) scheme of the various embodiments thus provides a very general and practical tool for error correction in quantum optimization.

In QA, the system undergoes an evolution governed by the following time-dependent, transverse-field Ising Hamiltonian:

H(t)=A(t)H_(X)+B(t)H_(P), t∈[0,t_(f)],  (1)

with respectively monotonically decreasing and increasing “annealing schedules” A(t) and B(t). The “driver Hamiltonian” H_(X)=Σ_(i)σ_(i)^(x) is a transverse field whose amplitude controls the tunneling rate. The solution to an optimization problem of interest is encoded in the ground state of the Ising problem Hamiltonian H_(P), with

$\begin{matrix}{H_{P} = \sum\limits_{i \in \nu} h_{i}\sigma_{i}^{z} + \sum\limits_{(i,j) \in \varepsilon} J_{ij}\sigma_{i}^{z}\sigma_{j}^{z},} & (2)\end{matrix}$

where the sums run over the weighted vertices ν and edges ε of a graph G=(ν, ε), and σ_(i)^(x,z) denote the Pauli operators acting on qubit i. Available QA devices use an array of superconducting flux qubits to physically realize the system described in Eqs. (1) and (2) on a fixed “Chimera” graph (see FIGS. 1B and 1D) with programmable local fields {h_(i)}, couplings {J_(ij)}, and annealing time t_(f).

FIGS. 1A-1D show an illustration of the nesting scheme. In FIGS. 1A and 1C, a C-degree nested graph is constructed by embedding a K_(N) into a K_(C×N), with N=4 and C=1 (FIG. 1A) and C=4 (FIG. 1C). Red, thick couplers are energy penalties defined on the nested graph between the (i, c) nested copies of each logical qubit i. FIGS. 1B and 1D show the nested graphs after ME on the Chimera graph. The thicker couplers correspond to the ferromagnetic chains introduced in the process.

The adiabatic theorem for closed systems guarantees that if the system is initialized in the ground state of H(0)=A(0)H_(X), a sufficiently slow evolution relative to the inverse minimum gap of H(t) will take the system with high probability to the ground state of the final Hamiltonian H(t_(f))=B(t_(f))H_(P). Dynamical errors then arise due to diabatic transitions, but they can be made arbitrarily small via boundary cancellation methods that control the smoothness of A(t) and B(t), as long as the adiabatic condition is satisfied. This means that in particular the probability of Landau-Zener transitions is exponentially suppressed, though of course t_(f) is still controlled by an inverse (cubic) power of the minimum gap. One can assume that the problem of Landau-Zener transitions is addressed by such boundary cancellation methods (though the experiments described below do not include such methods, since they could not be implemented with control over the smoothness of A(t) and B(t)) and focus here on addressing the errors that occur in open systems. For the latter, specifically a system that is weakly coupled to a thermal environment, the final state is a mixed state ρ(t_(f)) that is close to the Gibbs state associated with H(t_(f)) if equilibration is reached throughout the annealing process. In the adiabatic limit the open system QA process is thus better viewed as a Gibbs distribution sampler. The main goal of QAC is to suppress the associated thermal errors and restore the ability of QA to act as a ground state solver. In addition, QAC should suppress errors due to noise-driven deviations in the specification of H_(P).

Error correction is achieved in QAC by mapping the logical Hamiltonian H(t) to an appropriately chosen encoded Hamiltonian H̄(t):

H̄(t)=A(t)H_(X)+B(t)H̄_(P), t∈[0,t_(f)],  (3)

defined over a number N̄ of physical qubits larger than the number of logical qubits N=|ν|. Note that H̄_(P) also includes penalty terms, as explained below. The logical ground state of H_(P) is extracted from the encoded system's state ρ(t_(f)) through an appropriate decoding procedure. A successful error correction scheme should recover the logical ground state with a higher probability than a direct implementation of H_(P), or than a classical repetition code using the same number of physical qubits N̄. Due to practical limitations of current QA devices that prevent the encoding of H_(X), only H_(P) is encoded in QAC. In the future it may be possible to circumvent this limitation using coupling to ancilla qubits. At present it results in a tradeoff, since it requires optimization of the penalty strength and may also result in a need to optimize the nesting degree, since without encoding H_(X) the minimum gap may shrink relative to the unencoded problem.

In order to allow for the most general N-variable Ising optimization problem, a methodology in accordance with the various embodiments defines an encoding procedure for problem Hamiltonians H_(P) supported on a complete graph K_(N). The first step of the construction involves a “nested” Hamiltonian H̃_(P) that is defined by embedding the logical K_(N) into a larger K_(C×N). The integer C is the “nesting degree” and controls the amount of hardware resources (qubits, couplers, and local fields) used to represent the logical problem. H̃_(P) is constructed as follows. Each logical qubit i (i=1, . . . , N) is represented by a C-tuple of encoded qubits (i, c), with c=1, . . . , C. The “nested” couplers J̃_((i,c),(j,c′)) and local fields h̃_((i,c)) are then defined as follows:

J̃_((i,c),(j,c′))=J_(ij), ∀c,c′, i≠j,  (4a)

h̃_((i,c))=Ch_(i), ∀c,i,  (4b)

J̃_((i,c),(i,c′))=−γ, ∀c≠c′.  (4c)

This construction is illustrated in FIGS. 1A and 1C. Each logical coupling J_(ij) has C² copies J̃_((i,c),(j,c′)), thus boosting the energy scale at the encoded level by a factor of C². Each local field h_(i) has C copies h̃_((i,c)); the factor C in Eq. (4b) ensures that the energy boost is equalized with the couplers. For each logical qubit i, there are C(C−1)/2 ferromagnetic couplings J̃_((i,c),(i,c′)) of strength γ>0 (to be optimized), representing energy penalties that promote agreement among the C encoded qubits, i.e., that bind the C-tuple as a single logical qubit i.
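By way of a non-limiting illustration, the nesting step of Eqs. (4a)-(4c) may be sketched in Python as follows. The function name `nest_problem`, the dictionary layout, and the zero-based copy index c=0, . . . , C−1 are assumptions made for this sketch only and are not part of the described embodiments; the penalty of Eq. (4c) is placed between copies of the same logical qubit i, as the surrounding description states.

```python
from itertools import combinations


def nest_problem(h, J, C, gamma):
    """Build nested fields and couplers following Eqs. (4a)-(4c).

    h: dict {i: h_i}; J: dict {(i, j): J_ij} with one entry per pair i != j.
    Returns (h_nested, J_nested) keyed by encoded qubits (i, c), c = 0..C-1.
    """
    # Eq. (4b): every copy (i, c) carries the local field C * h_i.
    h_nested = {(i, c): C * h_i for i, h_i in h.items() for c in range(C)}

    J_nested = {}
    # Eq. (4a): each logical coupling J_ij is copied onto all C*C pairs.
    for (i, j), J_ij in J.items():
        for c in range(C):
            for c2 in range(C):
                J_nested[((i, c), (j, c2))] = J_ij
    # Eq. (4c): ferromagnetic penalty -gamma between copies of one logical qubit.
    for i in h:
        for c, c2 in combinations(range(C), 2):
            J_nested[((i, c), (i, c2))] = -gamma
    return h_nested, J_nested


# Antiferromagnetic K_4 with zero local fields, nested at C = 3.
N, C = 4, 3
h = {i: 0.0 for i in range(N)}
J = {(i, j): 1.0 for i, j in combinations(range(N), 2)}
h_nested, J_nested = nest_problem(h, J, C, gamma=0.5)
```

In this example the nested problem has C×N=12 encoded qubits, and the 54 problem couplers plus 12 penalty couplers together fill every edge of K₁₂.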

The second step of the construction is to implement the fully connected problem H̃_(P) on given QA hardware, with a lower-degree qubit connectivity graph. This requires a minor embedding (ME). The procedure involves replacing each qubit in H̃_(P) by a ferromagnetically coupled chain of qubits, such that all couplings in H̃_(P) are represented by inter-chain couplings. The intra-chain coupling represents another energy penalty that forces the chain qubits to behave as a single logical qubit. The physical Hamiltonian obtained after this ME step is the final encoded Hamiltonian H̄_(P). One can minor-embed a K_(C×N) nested graph by representing each qubit (i, c) as a physical chain of length

$L = \left\lceil \frac{CN}{4} \right\rceil + 1$

on the Chimera graph. This is illustrated in FIGS. 1B and 1D. The number of physical qubits necessary for a ME of a K_(C×N) on Chimera is therefore N_(phys)=CNL˜C²N²/4. More generally, the minor embedding of a K_(C×N) requires a number of physical qubits that grows as C²N², with the proportionality constant depending on the specific properties of the quantum hardware; for Chimera, for example, it is ¼.
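As a non-limiting numerical illustration of the resource count above, the following Python sketch evaluates the chain length and the number of physical qubits. It assumes that the bracket in the chain-length expression denotes the ceiling function; the function names are hypothetical.

```python
def chain_length(C, N):
    """Chain length L = ceil(C*N / 4) + 1 for a K_(C x N) on Chimera."""
    return (C * N + 3) // 4 + 1  # integer form of the ceiling


def physical_qubits(C, N):
    """Total number of physical qubits N_phys = C * N * L."""
    return C * N * chain_length(C, N)


# Antiferromagnetic K_4 (N = 4) at the smallest and largest nesting levels
# discussed in connection with FIGS. 2A-2C and 3A-3C.
for C in (1, 8, 13):
    print(C, chain_length(C, 4), physical_qubits(C, 4))
```

Under this reading, N=4 gives N_(phys)=8 at C=1, 288 at C=8, and 728 at C=13, which agrees with the N_(phys) ranges quoted for FIGS. 2A-2C and FIGS. 3A-3C.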

At the end of a QA run implementing the encoded Hamiltonian H̄_(P) and a measurement of the physical qubits, a decoding procedure must be employed to recover the logical state. For the sake of simplicity, one need only consider majority vote decoding over both the length-L chain of each encoded qubit (i, c) and the C encoded qubits comprising each logical qubit i (decoding over the length-L chain first, then over the C encoded qubits, does not affect performance; see Partition Function Calculation in Supplemental Information (SI)). The encoded and logical qubits can thus be viewed as forming repetition codes with distance L and C, respectively. Other decoding strategies are possible wherein the encoded or logical qubits do not have this simple interpretation, e.g., energy minimization decoding, which tends to outperform majority voting. In the unlikely event of a tie, one can assign a random value of +1 or −1 to the logical qubit.
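A minimal, non-limiting Python sketch of majority vote decoding is given below. It implements the two-stage variant mentioned above (vote along each length-L chain first, then across the C encoded qubits). The `readout` layout, the zero-based indices, and the function names are assumptions made for illustration only.

```python
import random


def majority(spins, rng=random):
    """Majority vote over +/-1 values; a tie is broken at random."""
    total = sum(spins)
    if total == 0:
        return rng.choice((+1, -1))
    return 1 if total > 0 else -1


def decode(readout, N, C, rng=random):
    """Recover N logical values from measured chains.

    readout[(i, c)] is the list of +/-1 values measured on the physical
    chain that represents encoded qubit (i, c).
    """
    logical = []
    for i in range(N):
        # Stage 1: collapse each length-L chain to one encoded value.
        encoded = [majority(readout[(i, c)], rng) for c in range(C)]
        # Stage 2: vote across the C encoded copies of logical qubit i.
        logical.append(majority(encoded, rng))
    return logical
```

For example, a single logical qubit with C=3 chains reading [1, 1, −1], [1, 1, 1], and [−1, −1, −1] decodes to +1, since two of the three chains vote +1.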

Free Energy

Using a mean-field analysis that reduces the model to an equivalent classical one by employing the Suzuki-Trotter formula, one can compute the partition function associated with the nested Hamiltonian A(t)H_(X)+B(t)H̃_(P) for the case with uniform antiferromagnetic couplings. This leads to the following free energy density in the low temperature and thermodynamic limits (see Free Energy in SI):

$\begin{matrix}{\beta F = C^{2}\beta\left( \sqrt{[A(t)/C]^{2} + [2\gamma B(t)m]^{2}} - \gamma B(t)m^{2} \right),} & (5)\end{matrix}$

where m is the mean-field magnetization. There are two key noteworthy aspects of this result. First, the driver term is rescaled as A(t)→C⁻¹A(t). This shifts the crossing between the A and B annealing schedules to an earlier point in the evolution and is related to the fact that QAC encodes only the problem Hamiltonian term proportional to B(t). Consequently, the quantum critical point is moved to earlier in the evolution, which benefits QAC since the effective energy scale at this new point is higher. Second, the inverse temperature is rescaled as β→C²β. This corresponds to an effective temperature reduction by C², a manifestly beneficial effect. The same conclusion, of a lower effective temperature, is reached by studying the numerically computed success probability associated with thermal distributions (see SI Sec. III). This prediction is borne out by the experimental results, though it is masked to some extent by complications arising from the ME and noise.

Effective Temperature Scaling for Optimization Applications

The important finding above is that the nesting scheme of the various embodiments allows one to increase the energy scale of the problem Hamiltonian implemented in a quantum annealing device. As discussed above, this “energy boost” can be interpreted as an effective reduction in the temperature at which the device operates. Therefore, by implementing NQAC it is possible to reduce both thermal and control errors. NQAC makes it possible to use an arbitrarily large amount of physical resources (number of qubits used) to lower the effective temperature below an acceptable threshold. This aspect is crucial because, although it is possible to scale the size of quantum annealing devices, there are fundamental practical limits that prevent an arbitrary reduction of the physical temperature of a physical system.

The NQAC encoding is defined in terms of a nesting level C that controls the amount of protection against thermal and control errors. C also controls the number of physical qubits N_(phys) used in a nested encoding, which scales as N_(phys)˜C². In the discussion above, it is shown that it is possible to obtain an energy boost μ_(C) that scales polynomially with the number of physical qubits used: μ_(C)˜C^(η) (η<2). This is equivalent to an effective temperature reduction T→T/μ_(C). This scaling law is demonstrated herein on a D-Wave 2000Q quantum annealer processor featuring 504 active flux-qubits. The size of this processor allowed confirmation of the scaling law up to C=8, with a scaling coefficient η≅0.52. This is illustrated in FIGS. 2A-2C.

FIGS. 2A-2C show results for the antiferromagnetic K₄. FIG. 2A shows DW Two success probabilities P_(C)(α) for 8 nesting levels C. Increasing C generally increases P_(C)(α) at fixed α. FIG. 2B shows rescaled P_(C)(αμ_(C)) data, exhibiting data-collapse. FIG. 2C shows scaling of the energy boost μ_(C) vs C. In FIGS. 2A-2C, N_(phys)∈[8, 288].

It is crucial to confirm that the temperature reduction achieved through NQAC is truly scalable, i.e., that the scaling law T_(phys)˜μ_(C)^(−1)˜C^(−η) is valid for arbitrarily large values of the nesting level C. Experiments recently confirmed that this is indeed the case for nesting levels up to C=13 by implementing NQAC on the latest-generation D-Wave quantum annealer (D-Wave 2000Q) with 2023 available flux-qubits. The resulting scaling coefficient in this case is η≅0.35 and is demonstrated by the experimental data shown in FIGS. 3A-3C.

FIGS. 3A-3C show results for the antiferromagnetic K₄. In particular, FIG. 3A shows D-Wave 2000Q success probabilities P_(C)(α) for 13 nesting levels C. Increasing C generally increases P_(C)(α) at fixed α. FIG. 3B shows rescaled P_(C)(αμ_(C)) data, exhibiting data-collapse. FIG. 3C shows scaling of the energy boost μ_(C) versus C². In FIGS. 3A-3C, N_(phys)∈[8, 728].

Effective Temperature Scaling for Machine-Learning Applications

A Boltzmann machine is a generative probabilistic model that can be used for both supervised and non-supervised machine-learning applications. Moreover, it can be used as a building block for deep-belief networks, thus playing a role in the booming fields of artificial intelligence and deep-learning. A Boltzmann machine associates a given data point z≡{z_(i)} (here represented as a string z_(i)=±1, i=1, . . . , N) to an energy function E(z):

$\begin{matrix}{E(z) = \sum\limits_{i \in \nu} b_{i}z_{i} + \sum\limits_{(i,j) \in \varepsilon} w_{ij}z_{i}z_{j},} & (6)\end{matrix}$

and a corresponding probability distribution P(z):

$\begin{matrix}{P(z) = e^{-E(z)}/Z,\quad Z = \sum\limits_{z} e^{-E(z)}.} & (7)\end{matrix}$

Notice that the quantities above are defined on the vertices ν and edges ε of some graph G=(ν, ε). A Boltzmann machine is thus also a graphical model. Training a Boltzmann machine consists of finding the values of the weights b_(i) and w_(ij) such that the probability distribution P(z) generated by the model is as close as possible to the probability distribution of the training set. The training of a Boltzmann machine is achieved by iteratively adjusting the weights of the model according to the following update rules:

$\begin{matrix}{\delta b_{i} \sim \langle z_{i} \rangle_{D} - \langle z_{i} \rangle_{S},\quad \delta w_{ij} \sim \langle z_{i}z_{j} \rangle_{D} - \langle z_{i}z_{j} \rangle_{S}.} & (8)\end{matrix}$

In Eq. (8), the first term in each expression (the ⟨z⟩_(D) term) is the expectation value as measured on the training set and the second term in each expression (the ⟨z⟩_(S) term) is the expectation value as measured using P(z). Computing the second term is known to be computationally hard with classical algorithms, as it requires thermal expectation values of linear and quadratic functions of the variables z_(i).
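For very small models, the quantities entering Eqs. (6)-(8) can be evaluated exactly by enumerating all 2^N states, which also makes the exponential cost of the second term explicit. The following non-limiting Python sketch does so; the function names and data layout are assumptions made for illustration. It returns only the right-hand sides of Eq. (8) and leaves the proportionality constant (the learning rate and its sign convention) to the caller.

```python
import itertools
import math


def energy(z, b, w):
    """E(z) of Eq. (6): sum_i b_i z_i + sum_(i,j) w_ij z_i z_j."""
    linear = sum(b_i * z_i for b_i, z_i in zip(b, z))
    quadratic = sum(w_ij * z[i] * z[j] for (i, j), w_ij in w.items())
    return linear + quadratic


def model_averages(b, w):
    """Exact <z_i> and <z_i z_j> under P(z) = exp(-E(z)) / Z, Eq. (7).

    Enumerates all 2**n states, so it is only usable for small n.
    """
    states = list(itertools.product((-1, 1), repeat=len(b)))
    weights = [math.exp(-energy(z, b, w)) for z in states]
    Z = sum(weights)
    mean_z = [sum(p * z[i] for p, z in zip(weights, states)) / Z
              for i in range(len(b))]
    mean_zz = {(i, j): sum(p * z[i] * z[j] for p, z in zip(weights, states)) / Z
               for (i, j) in w}
    return mean_z, mean_zz


def data_averages(data, pairs):
    """Empirical <z_i> and <z_i z_j> over a list of +/-1 training strings."""
    n = len(data[0])
    mean_z = [sum(z[i] for z in data) / len(data) for i in range(n)]
    mean_zz = {(i, j): sum(z[i] * z[j] for z in data) / len(data)
               for (i, j) in pairs}
    return mean_z, mean_zz


def update_terms(b, w, data):
    """Right-hand sides of Eq. (8): data average minus model average."""
    data_z, data_zz = data_averages(data, list(w))
    model_z, model_zz = model_averages(b, w)
    delta_b = [d - s for d, s in zip(data_z, model_z)]
    delta_w = {k: data_zz[k] - model_zz[k] for k in w}
    return delta_b, delta_w
```

In the sampling approach described herein, `model_averages` is the part that would be replaced by averages over states returned by the annealer.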

The update rules could be computed by a physical device that efficiently samples from a thermal distribution, i.e., one that can sample states according to the probability distribution defined above. Recently, the scientific community has begun to use quantum annealers for the above-mentioned sampling task. More precisely, the outcome distribution of states obtained by running a quantum annealing device implementing a problem Hamiltonian as in Eq. (2) is a good approximation of the thermal probability distribution above with:

β_(eff)h_(i)=b_(i), β_(eff)J_(ij)=w_(ij).  (9)

The existence of an effective temperature T_(eff)=1/β_(eff) at which the annealer samples is due to the non-trivial early freezing dynamics that takes place toward the end of the annealing process and should be determined experimentally.

Below is provided evidence that two major limitations of using quantum annealing devices in solving machine-learning problems using Boltzmann machines can be overcome:

-   a. Limited connectivity. Following the procedure described above, quantum annealers should be used to train Boltzmann machines whose graph matches (a subgraph of) the connectivity graph of the device (see the definition of Chimera graph herein). It is shown herein that it is indeed possible to train a Boltzmann machine on fully connected graphs after a minor embedding procedure as described herein.
-   b. Limited coupling strength and precision. Quantum annealing devices have technical limitations that impose a largest allowed magnitude for the local fields and couplings: |h_(i)|<h_(max), |J_(ij)|<J_(max). For a given machine-learning problem, the optimal values of the weights may exceed these maximal values. Indeed, it is expected that β_(eff) is reduced with the complexity of the problem, thus requiring larger values of the local fields h_(i) and couplings J_(ij) for a correct training of a Boltzmann machine with quantum annealers. It is shown herein that it is possible to use the NQAC encoding to boost the effective value of the couplings: (h_(i),J_(ij))→μ′_(C)(h_(i),J_(ij)). This energy boost μ′_(C) plays a similar role to the energy boost μ_(C) defined above in connection with optimization problems.

Thus, quantum annealers can be used to sample from a thermal distribution of a fully connected Boltzmann machine, and an NQAC encoding can boost the strength of the couplers (h_(i),J_(ij)), thus achieving a reduction of the effective temperature at which the annealer samples.

One can derive the effective inverse temperature β_(C,eff) that enters the relation

β_(C,eff)(h_(i),J_(ij))≡(b_(i),w_(ij))  (10)

and show its dependence on the nesting level C. To do so, one can first define a reduced probability distribution for degenerate states:

$\begin{matrix}{{p\left( E_{i} \right)} = {\sum\limits_{{z{E{(z)}}} = E_{i}}\; {{P(z)}.}}} & (11)\end{matrix}$

Once can now compute numerically the reduced probability distributionp_(T)(ρ,E_(i)) above (thermal case) with the weights b₁=h_(i)/ρ andwij=J_(ij)/ρ and the experimental distribution p_(DW)(C, E_(i)) computedby implementing a fully connected graph with C nesting levels withcouplings (h_(i),J_(ij)) in accordance with the quantum annealer to beused. The effective inverse temperature β_(C,eff) is obtained byminimizing the following distribution distance:

$\begin{matrix}{\beta_{C,{eff}} = {{\arg {\min\limits_{\rho}\left( {\frac{1}{2}{\sum\limits_{i}\; {{{p_{DW}\left( {C,E_{i}} \right)} - {p\left( {\rho,E_{i}} \right)}}}}} \right)}} \equiv {\arg {\min\limits_{\rho}{{D\left( {p_{DW},p_{T}} \right)}.}}}}} & (12)\end{matrix}$

In other words, the effective sampling (inverse) temperature β_(C,eff)at which the quantum annealer operates is such that the theoreticalreduced (thermal) energy distribution p(ρ, E_(i)) is as close aspossible to the output distribution obtained from the quantum annealer.
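The fit of Eq. (12) may be sketched, in a non-limiting way, as a one-dimensional scan over ρ. In the Python sketch below, the distance is read as half the sum of absolute differences between the two reduced distributions, the thermal model is supplied by the caller, and the two-level model and the "measured" numbers are made up purely to exercise the code; none of the names or values come from the experiments described herein.

```python
import math


def distance(p, q):
    """D(p, q) = (1/2) * sum_i |p(E_i) - q(E_i)| over all energy levels."""
    levels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(e, 0.0) - q.get(e, 0.0)) for e in levels)


def fit_rho(p_dw, thermal, grid):
    """Return the grid value rho that minimizes D(p_dw, thermal(rho))."""
    return min(grid, key=lambda rho: distance(p_dw, thermal(rho)))


def two_level_thermal(rho):
    """Hypothetical toy model: energies -1 and +1 with weights exp(-rho*E)."""
    weights = {-1.0: math.exp(rho), 1.0: math.exp(-rho)}
    Z = sum(weights.values())
    return {e: wt / Z for e, wt in weights.items()}


# Made-up reduced distribution standing in for annealer output.
p_dw = {-1.0: 0.88, 1.0: 0.12}
grid = [k / 10 for k in range(31)]  # rho = 0.0, 0.1, ..., 3.0
rho_fit = fit_rho(p_dw, two_level_thermal, grid)
```

A grid scan is used here only for transparency; any scalar minimizer could be substituted, and in practice the thermal distribution would be computed from the logical problem rather than from a toy model.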

Now having obtained the effective sampling temperature, one can determine the “quality” of the distribution obtained from the quantum annealer, that is, establish whether the experimental distribution is a good approximation of a thermal distribution. The figure-of-merit is the “gradient overlap” between the numerically and experimentally determined variations of the weights. Recalling the discussion above about the training of a Boltzmann machine, the weights are adjusted according to the following gradient:

$\begin{matrix}{\delta w_{ij} \sim \nabla_{ij}^{T} \equiv \langle z_{i}z_{j} \rangle_{T}.} & (13)\end{matrix}$

The subscript T is a reminder that the average above gives the theoretical (or thermal) gradient. Similarly, one can compute the experimental gradient using the quantum annealer:

$\begin{matrix}{\nabla_{ij}^{DW} \equiv \langle z_{i}z_{j} \rangle_{DW}.} & (14)\end{matrix}$

One can treat both ∇_(ij)^(T) and ∇_(ij)^(DW) as vectors and compute their normalized overlap:

$\begin{matrix}{O = \frac{\sum\limits_{ij}\nabla_{ij}^{T}\nabla_{ij}^{DW}}{\sqrt{\sum\limits_{ij}\left( \nabla_{ij}^{T} \right)^{2}\sum\limits_{i^{\prime}j^{\prime}}\left( \nabla_{i^{\prime}j^{\prime}}^{DW} \right)^{2}}}.} & (15)\end{matrix}$

One can consider the gradient overlap O because it does not depend on the normalization of the gradients, which corresponds to the less important learning rate. A gradient overlap O close to 1 means that the experimentally determined update is very close to the exact one and ensures that the training update is performed along the same direction in the parameter space when using either the exact or the experimental gradient.
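As an illustration, the overlap can be computed as a cosine similarity between the two gradients viewed as vectors over pairs i<j. The sketch below assumes exactly that reading of the normalized overlap; the function name and the choice to use only the upper triangle of each matrix are illustrative assumptions.

```python
import numpy as np

def gradient_overlap(grad_thermal, grad_measured):
    """Cosine similarity between two symmetric gradient matrices over pairs i < j."""
    a = np.asarray(grad_thermal, dtype=float)
    b = np.asarray(grad_measured, dtype=float)
    upper = np.triu_indices(a.shape[0], k=1)  # each pair (i, j) counted once
    a, b = a[upper], b[upper]
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))
```

Because of the normalization, rescaling either gradient by a positive constant leaves the overlap unchanged, which is why the learning rate drops out.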

FIG. 4 illustrates an exemplary system for implementing the various embodiments. In particular, FIG. 4 illustrates a hybrid computing system 100 including a digital computer 105 coupled to an analog computer 150. In some implementations, analog computer 150 is a quantum processor. The exemplary digital computer 105 includes a digital processor (CPU) 110 that may be used to perform classical digital processing tasks.

Digital computer 105 may include at least one digital processor (such as central processor unit 110 with one or more cores), at least one system memory 120, and at least one system bus 117 that couples various system components, including system memory 120, to central processor unit 110.

The digital processor may be any logic processing unit, such as one or more central processing units (“CPUs”), graphics processing units (“GPUs”), digital signal processors (“DSPs”), application-specific integrated circuits (“ASICs”), field-programmable gate arrays (“FPGAs”), programmable logic controllers (“PLCs”), etc., and/or combinations of the same.

Unless described otherwise, the construction and operation of the various blocks shown in FIG. 4 are of conventional design. As a result, such blocks need not be described in further detail herein, as they will be understood by those skilled in the relevant art.

Digital computer 105 may include a user input/output subsystem 111. In some implementations, the user input/output subsystem includes one or more user input/output components such as a display 112, mouse 113, and/or keyboard 114.

System bus 117 can employ any known bus structures or architectures, including a memory bus with a memory controller, a peripheral bus, and a local bus. System memory 120 may include non-volatile memory, such as read-only memory (“ROM”), static random access memory (“SRAM”), Flash NAND; and volatile memory such as random access memory (“RAM”) (not shown).

Digital computer 105 may also include other non-transitory computer- or processor-readable storage media or non-volatile memory 115. Non-volatile memory 115 may take a variety of forms, including: a hard disk drive for reading from and writing to a hard disk, an optical disk drive for reading from and writing to removable optical disks, and/or a magnetic disk drive for reading from and writing to magnetic disks. The optical disk can be a CD-ROM or DVD, while the magnetic disk can be a magnetic floppy disk or diskette. Non-volatile memory 115 may communicate with the digital processor via system bus 117 and may include appropriate interfaces or controllers 116 coupled to system bus 117. Non-volatile memory 115 may serve as long-term storage for processor- or computer-readable instructions, data structures, or other data (sometimes called program modules) for digital computer 105.

Although digital computer 105 has been described as employing hard disks, optical disks and/or magnetic disks, those skilled in the relevant art will appreciate that other types of non-volatile computer-readable media may be employed, such as magnetic cassettes, flash memory cards, Flash, ROMs, smart cards, etc. Those skilled in the relevant art will also appreciate that some computer architectures employ both volatile memory and non-volatile memory. For example, data in volatile memory can be cached to non-volatile memory, or to a solid-state disk that employs integrated circuits to provide non-volatile memory.

Various processor- or computer-readable instructions, data structures, or other data can be stored in system memory 120. For example, system memory 120 may store instructions for communicating with remote clients and scheduling use of resources, including resources on the digital computer 105 and analog computer 150. Also for example, system memory 120 may store at least one of processor-executable instructions or data that, when executed by at least one processor, cause the at least one processor to execute the various algorithms described elsewhere herein, including machine learning related algorithms.

In some implementations, system memory 120 may store processor- or computer-readable calculation instructions to perform pre-processing, co-processing, and post-processing for analog computer 150. System memory 120 may store a set of analog computer interface instructions to interact with analog computer 150.

Analog computer 150 may include at least one analog processor such as quantum processor 140. A quantum processor is a computing device that can harness quantum physical phenomena (such as superposition, entanglement, and quantum tunneling) unavailable to non-quantum devices. A quantum processor may take the form of a superconducting quantum processor. A superconducting quantum processor may include a number of qubits and associated local bias devices, for instance two or more superconducting qubits. An example of a qubit is a flux qubit. A superconducting quantum processor may also employ coupling devices (i.e., “couplers”) providing communicative coupling between qubits. Further details and embodiments of exemplary quantum processors that may be used to implement the various embodiments are described in, for example, U.S. Pat. Nos. 7,533,068; 8,008,942; 8,195,596; 8,190,548; and 8,421,053. However, the various embodiments are not limited to such quantum processors, and other types of quantum processors can be used without limitation.

Quantum processor 140 may be a general quantum processor or a more specialized quantum processor, such as a quantum annealing processor. Analog computer 150 can be provided in an isolated environment, for example, an environment that shields the internal elements of the quantum computer from heat, magnetic fields, and other external noise (not shown). The isolated environment may include a refrigerator, for instance a dilution refrigerator, operable to cryogenically cool the analog processor, for example to a temperature below approximately 1 kelvin.

In an exemplary implementation of a system in accordance with the various embodiments, a user would submit the problem to be solved to the digital computer 105 via user interface 111. The digital computer 105 would then convert or encode the problem, as needed, and provide it to analog computer 150, via controller 116, for processing using the quantum processor 140. When the processing run is completed, measurements of the qubits at quantum processor 140 are transmitted back to the digital computer 105 for subsequent decoding and processing.

Examples

The examples shown here are not intended to limit the various embodiments. Rather, they are presented solely for illustrative purposes.

NQAC Optimization Results

NQAC was tested by studying antiferromagnetic complete graphs numerically, as well as on a D-Wave 2000Q processor featuring 504 flux qubits connected by 1427 tunable composite qubits acting as Ising-interaction couplings, arranged in a non-planar Chimera-graph lattice (complete graphs were also studied for a spin glass model). The data discussed below demonstrate that the encoding schemes of the various embodiments yield a steady improvement in the probability of reaching the ground state as a function of the nesting degree, even after minor-embedding the complete graph onto the physical graph of the quantum annealer. Also demonstrated is that NQAC outperforms classical repetition code schemes that use the same number of physical qubits.

For purposes of illustrating the various embodiments, the following discussion is directed to the more significant results of the testing. However, in the section entitled “Supplemental Information,” a more detailed explanation is provided with additional results and details regarding the testing of NQAC.

The hardness of an Ising optimization problem, using a QA device, is controlled by its size N as well as by an overall energy scale α. The smaller this energy scale, the higher the effective temperature and the more susceptible QA becomes to (dynamical and thermal) excitations out of the ground state and to misspecification noise on the problem Hamiltonian. This provides an opportunity to test NQAC. Since the experiments were limited by the largest complete graph that can be embedded on the D-Wave 2000Q device, a K₃₂ (see SI Sec. IV for details), the hardness of a problem was tuned by studying the performance of NQAC as a function of α via H_(P)→αH_(P), with 0<α<1. Note that γ was not rescaled; instead, γ was optimized for optimal post-decoding performance (see SI Sec. V). It is known that for the D-Wave 2000Q, intrinsic coupler control noise can be taken to be Gaussian with standard deviation σ˜0.05 of the maximum value for the couplings [48]. Thus, one may expect that, without error correction, Ising problems with α<0.05 are dominated by control noise.

NQAC was applied to completely antiferromagnetic (h_(i)=0 ∀i) Ising problems over K₄ (J_(ij)=1 ∀i,j) and K₈ (random J_(ij)∈[0.1,1] with steps of 0.1), with nesting up to C=8 and C=4, respectively. P_(C)(α) denotes the probability to obtain the logical ground state at energy scale α for the C-degree nested implementation (see SI Sec. I for data collection methods). The results of these experiments are shown in FIGS. 5A, 5B, and 5C.

FIGS. 5A-5C show results for the antiferromagnetic K₄, after encoding, followed by ME and decoding. FIG. 5A shows D-Wave 2000Q success probabilities P_(C)(α) for eight nesting degrees C. Increasing C generally increases P_(C)(α) at fixed α. FIG. 5B shows rescaled P_(C)(αμ_(C)) data, exhibiting data collapse. FIG. 5C shows scaling of the energy boost μ_(C) versus the maximal energy boost μ_(C)^(max), for both the D-Wave 2000Q and SQA. In these figures, purple circles show D-Wave 2000Q results; blue stars show SQA for the case of no ME (i.e., for the problem defined directly over K_(C×N) and no coupler noise); red up-triangles show SQA for the Choi ME (for a full Chimera graph), with σ=0.05 Gaussian noise on the couplings; and yellow right-triangles show SQA for the D-Wave 2000Q heuristic ME (applied to a Chimera graph with 8 missing qubits) with σ=0.05 Gaussian noise on the couplings. As will be discussed further below, the flattening of μ_(C) suggests that the energy boost becomes less effective at large C. However, this can be remedied by increasing the number of SQA sweeps (see SI Sec. III), fixed here at 10⁴. Thus the lines represent best fits to only the first four data points, with slopes 0.98, 0.91, 0.62 and 0.69, respectively. In FIGS. 5A-5C, N_(phys)∈[8, 488].

The experimental QA data in FIG. 5A show a monotonic increase of P_(C)(α) as a function of the nesting degree C over a wide range of energy scales α. As expected, P_(C)(α) drops from P_(C)(1)=1 (solution always found) to P_(C)(0)=6/16 (random sampling of 6 ground states, where 4 out of the 6 couplings are satisfied, out of a total of 16 states).
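The 6/16 figure can be checked by direct enumeration. The short sketch below lists all 16 spin configurations of the antiferromagnetic K₄ (J_(ij)=1, h_(i)=0), finds the minimum-energy ones, and counts how many of the 6 couplings each ground state satisfies. The function name is illustrative.

```python
import itertools

def k4_ground_states():
    """Enumerate the antiferromagnetic K4 (J_ij = 1, h_i = 0) and return its ground states."""
    pairs = list(itertools.combinations(range(4), 2))  # the 6 couplings of K4
    energies = {
        z: sum(z[i] * z[j] for i, j in pairs)
        for z in itertools.product([-1, 1], repeat=4)
    }
    e_min = min(energies.values())
    ground = [z for z, e in energies.items() if e == e_min]
    # an antiferromagnetic coupling is satisfied when the two spins are anti-aligned
    satisfied = [sum(1 for i, j in pairs if z[i] * z[j] == -1) for z in ground]
    return ground, satisfied, len(energies)

ground, satisfied, total = k4_ground_states()
print(len(ground), total, satisfied)  # 6 16 [4, 4, 4, 4, 4, 4]
```

The ground states are the six configurations with two spins up and two spins down, so a uniformly random guess succeeds with probability 6/16.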

Note that P₁(α) (no nesting) drops by ˜50% when α˜0.1, which is consistent with the aforementioned σ˜0.05 control noise level, while P₈(α) exhibits a similar drop only when α˜0.01. This suggests that NQAC is particularly effective in mitigating the two dominant effects that limit the performance of quantum annealers: thermal excitations and control errors. To investigate this more closely, FIG. 5B shows that the data from FIG. 5A can be collapsed via P_(C)(α)→P_(C)(α/μ_(C)), where μ_(C) is an empirical rescaling factor discussed below (see also SI Sec. VI). This implies that P₁(μ_(C)α)≈P_(C)(α), and hence that the performance enhancement obtained at nesting degree C can be interpreted as an energy boost α→μ_(C)α with respect to an implementation without nesting.

The existence of this energy boost is a key feature of NQAC, as anticipated above. Recall [Eq. (4)] that a nested graph K_(C×N) contains C² equivalent copies of the same logical coupling J_(ij). Hence a degree-C nesting before ME can provide a maximal energy boost μ_(C)^(max)=C^(η_(max)), with η_(max)=2. This simple argument agrees with the reduction of the effective temperature by C² based on the calculation of the free energy (5). FIG. 5C shows μ_(C) as a function of μ_(C)^(max), yielding μ_(C)˜C^(η) with η≈1.37 (grey circles). To understand why η<η_(max), simulated quantum annealing (SQA) simulations were performed (see SI Sec. VII for details). One can observe in FIG. 5C that without ME and control errors, the boost scaling matches μ_(C)^(max) (dark stars). When ME and control errors are included, a performance drop results (dark triangles). Both factors thus contribute to the sub-optimal energy boost observed experimentally. However, the optimal energy boost is recovered for a fully thermalized state with a sufficiently large penalty (see SI Sec. III). To match the experimental D-Wave 2000Q results using SQA, the Choi ME designed for full Chimera graphs was replaced by the heuristic ME designed for Chimera graphs with missing qubits, achieving a near match (light triangles) (see SI Sec. IV for more details on ME).
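An exponent η in a power law μ_(C)˜C^(η) is typically extracted as the slope of a straight-line fit on log-log axes. The following sketch shows one way to do that with a least-squares fit; the function name is hypothetical, and the data in the usage lines are synthetic values generated from an exact C² law purely to exercise the code, not measured boosts.

```python
import numpy as np

def fit_boost_exponent(nesting_degrees, boosts):
    """Least-squares slope of log(mu_C) versus log(C), i.e. eta in mu_C ~ C**eta."""
    slope, _intercept = np.polyfit(np.log(nesting_degrees), np.log(boosts), 1)
    return float(slope)

# Synthetic check only: an ideal boost mu_C = C**2 should give eta close to 2.
degrees = np.array([1, 2, 3, 4])
eta_ideal = fit_boost_exponent(degrees, degrees ** 2.0)
```

When μ_(C) is instead plotted against μ_(C)^(max)=C², as in FIG. 5C, the fitted slope equals η/2.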

Performance of NQAC Vs Classical Repetition

Recall that N_(C)^(phys)=CNL is the total number of physical qubits used at nesting degree C; let C_(max) denote the highest nesting degree that can be accommodated on the QA device for a given K_(N), i.e., C_(max)NL≦N_(tot)<(C_(max)+1)NL, where N_(tot) is the total number of physical qubits (504 in the experiments). Then M_(C)=⌊N_(C_(max))^(phys)/N_(C)^(phys)⌋ is the number of copies that can be implemented in parallel. For NQAC at degree C to be useful, it must be more effective than a classical repetition scheme where M_(C) copies of the problem are implemented in parallel. If a single implementation has success probability P_(C)(α), the probability to succeed at least once with M_(C) statistically independent implementations is P_(C)′(α)=1−[1−P_(C)(α)]^(M_(C)). It turns out that the antiferromagnetic K₄ problem, for which a random guess succeeds with probability 6/16, is too easy [i.e., P_(C)′(α) approaches 1 too rapidly], and therefore one can consider a harder problem: an antiferromagnetic K₈ instance with couplings randomly generated from the set J_(ij)∈{0.1, 0.2, . . . , 0.9, 1} (see SI Sec. V for more details and data on this and additional instances). Problems of this type turn out to have a sufficiently low success probability for purposes of the methodology of the various embodiments, and can still be nested up to C=4 on the D-Wave 2000Q processor.
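The classical-repetition adjustment is simple enough to state as code. The sketch below takes the physical qubit counts as inputs (so it makes no assumption about chain lengths) and evaluates the probability of at least one success among independent parallel copies. Function names are illustrative.

```python
def parallel_copies(n_phys_at_c_max, n_phys_at_c):
    """Number of degree-C copies that fit in the qubits used by the largest nesting."""
    return n_phys_at_c_max // n_phys_at_c

def repetition_adjusted_success(p_single, copies):
    """Probability that at least one of `copies` independent runs succeeds."""
    return 1.0 - (1.0 - p_single) ** copies
```

For example, four independent copies of a problem with single-copy success probability 6/16 succeed at least once with probability 1−(10/16)⁴≈0.847, which illustrates why an easy problem saturates quickly under classical repetition.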

FIGS. 6A-6C show random antiferromagnetic K₈ results. FIG. 6A shows success probabilities P_(C)(α) for four nesting degrees. FIG. 6B shows success probabilities P_(C)′(α) adjusted for classical repetition. FIG. 6C shows numerical results for SQA simulations with 20000 sweeps, σ=0.05 Gaussian noise on the couplings, and with the Choi embedding, showing five nesting degrees. The inset in FIG. 6C shows scaling of the energy boost μ_(C) versus the maximal energy boost μ_(C)^(max), for both the D-Wave 2000Q and SQA. In these figures, circles show D-Wave 2000Q results; crosses and up-triangles show SQA for the Choi ME with 10000 (crosses) and 20000 (up-triangles) sweeps, and with σ=0.05 Gaussian noise on the couplings. The flattening of μ_(C) for C>4 suggests that the energy boost becomes less effective at large C, but increasing the number of sweeps recovers the effectiveness. The lines represent best fits to only the first four data points, with respective slopes η/2=0.65, 0.75, and 0.85.

As noted above, results for P_(C)(α) are shown in FIG. 6A, and again increase monotonically with C, as in the K₄ case. For each C, P_(C)(α) peaks at a value of α for which the maximum allowed strength of the energy penalties γ=1 is optimal (γ>1 would be optimal for larger α, as shown in SI Sec. V; the growth of the optimal penalty with problem size, and hence chain length, is a typical feature of minor-embedded problems). An energy-boost interpretation of the experimental data of FIG. 6A is possible for α values to the left of the peak; to the right of the peak, the performance is hindered by the saturation of the energy penalties.

FIG. 6B compares the success probabilities P_(C)′(α) adjusted for classical repetition, having set C_(max)=4, and shows that P₂′(α)>P₁′(α), i.e., even after accounting for classical parallelism, C=2 performs better than C=1. However, P₄′(α)<P₃′(α)≦P₂′(α), so no additional gain results from increasing C in the experiments. This can be attributed to the fact that even the K₈ problem still has a relatively large P₁(α). Experimental tests on QA devices with more qubits will thus be important to test the efficacy of higher nesting degrees on harder problems.

To test the effect of increasing C, and also to study the effect of varying the annealing time, FIG. 6C presents the performance of SQA on a random K₈ antiferromagnetic instance with the Choi ME. The results are qualitatively similar to those observed on the D-Wave 2000Q processor with the heuristic ME [FIG. 6A]. Interestingly, one can observe a drop in the peak performance at C=5 relative to the peak observed for C=4. One can attribute this to both a saturation of the energy penalties and a suboptimal number of sweeps. The latter is confirmed in the inset in FIG. 6C, where one can observe that the scaling of μ_(C) with C is better for the case with more sweeps, i.e., again μ_(C)˜C^(η), and η increases with the number of sweeps.

Nested QAC offers several significant improvements over previous approaches to the problem of error correction for QA. It is a flexible method that can be used with any optimization problem, and allows the construction of a family of codes with arbitrary code distance. Studies performed with a D-Wave QA device and numerical simulations show that nesting is effective. Further, the protection from errors provided by NQAC can be interpreted as arising from an increase (with nesting degree C) in the energy scale at which the logical problem is implemented. This represents a very useful tradeoff: the effective temperature drops as one increases the number of qubits allocated to the encoding, so that these two resources can be traded. Thus NQAC can be used to combat thermal excitations, which are the dominant source of errors in open-system QA and are the bottleneck for scalable QA implementations, assuming that closed-system Landau-Zener transitions have been suppressed using other methods. Also demonstrated is that an appropriate nesting degree can outperform classical repetition with the same number of qubits, with improvements to be expected when next-generation QA devices with larger numbers of physical qubits become available.

NQAC Sampling Results

FIGS. 7A and 7B show experimental results obtained on four ensembles of 100 fully connected weighted graphs on N=16 and N=24 variables, which can be encoded with up to C=3 and C=2 nesting levels, respectively, on the D-Wave 2000Q quantum annealer.

FIG. 7A shows the increase of the sampling inverse temperature as a function of the nesting level C. FIG. 7B shows the gradient overlap remaining close to 1 as a function of the nesting level C.

The weights of the instances of the ensembles are randomly picked within the set {±1} (int) and {±0.1, ±0.2, . . . , ±1} (frac). FIG. 7A shows the monotonic boost of the effective (inverse) sampling temperature β_(C,eff) as a function of the nesting level. In FIG. 7B, the gradient overlaps are shown. The gradient overlaps are remarkably close to 1 for all ensembles and levels of nesting.

FIGS. 7A and 7B give evidence that non-native (minor embedded) graphical models can be trained with quantum annealers and that NQAC can be additionally implemented to effectively boost the value of the physical couplings (h_(i),J_(ij))→μ′_(C)(h_(i), J_(ij)), where μ′_(C)≡β_(C,eff)/β_(1,eff), thus overcoming a fundamental technical limitation of quantum annealing devices.

While various embodiments of the present invention have been described above and in the Supplemental Information section that follows, it should be understood that they have been presented by way of example only, and not limitation. Numerous changes to the disclosed embodiments can be made in accordance with the disclosure herein without departing from the spirit or scope of the invention. Thus, the breadth and scope of the present invention should not be limited by any of the above described embodiments. Rather, the scope of the invention should be defined in accordance with the following claims and their equivalents.

Although the invention has been illustrated and described with respect to one or more implementations, equivalent alterations and modifications will occur to others skilled in the art upon the reading and understanding of this specification and the annexed drawings. In addition, while a particular feature of the invention may have been disclosed with respect to only one of several implementations, such feature may be combined with one or more other features of the other implementations as may be desired and advantageous for any given or particular application.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, to the extent that the terms “including”, “includes”, “having”, “has”, “with”, or variants thereof are used in either the detailed description and/or the claims, such terms are intended to be inclusive in a manner similar to the term “comprising.”

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

Supplemental Information

The following section is provided to supplement and enhance understanding of the results discussed above. However, nothing in this section is intended to limit the various embodiments.

I. Experimental Methods

As noted above in the Examples section, NQAC was tested on the D-Wave 2000Q quantum annealing device at the University of Southern California's Information Sciences Institute (USC-ISI), which has been described in numerous previous publications (e.g., see [1]). The largest complete graph that can be embedded on this device, featuring 504 active qubits, is a K₃₂.

An experimental value of the success probability P_(C)(α,γ) was determined as a function of the energy penalty strength γ. Whenever the γ dependence is not explicitly considered, all figures show the optimal value P_(C)(α)=max_(γ) P_(C)(α,γ), with γ∈{0.05, 0.1, 0.2, . . . , 0.9, 1}. The same penalty value was used for both the nesting and the ME. In principle these two values can be optimized separately for improved performance, but this was not pursued since the resulting improvement is small, as shown in FIG. 8, and costly, since each instance needs to be rerun at all penalty settings.

FIG. 8 shows the effect of separately optimizing γ for ME and penalties. The plot shows the success probability from SQA simulations, for NQAC applied to a random antiferromagnetic K₈ with 10,000 sweeps, σ=0.05 noise, Choi embedding, and β=0.1. The results obtained after separately optimizing the penalty for the nesting and for the ME are denoted “non-unif”, while the results for using a single penalty for both (the strategy used in the main text) are denoted “unif”. The former results in a small improvement. Also shown is that separate (“MV ME”) or joint (“MV all”) majority vote decoding of the nesting and the ME has no effect.
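Majority vote decoding of the nesting can be sketched as follows. This is an illustrative sketch under stated assumptions: the readout is a flat array of ±1 values ordered so that the C copies of each logical qubit are contiguous, and ties (possible only for even C) are broken toward +1. Neither the layout nor the tie-breaking rule is specified in the text; both are assumptions, as is the function name.

```python
import numpy as np

def majority_vote_decode(spins, num_logical, nesting_degree):
    """Decode N*C readout values (+1/-1) into N logical spins by majority vote."""
    copies = np.asarray(spins).reshape(num_logical, nesting_degree)
    totals = copies.sum(axis=1)
    # assumption: a tie (total == 0, only possible for even C) is resolved as +1
    return np.where(totals >= 0, 1, -1)
```

The same routine could be applied first to each chain of the minor embedding and then to the nesting (separate decoding), or once to all physical qubits belonging to a logical qubit (joint decoding).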

Each P_(C)(α,γ) is the overall success probability after 2×10⁴ annealing runs obtained by implementing 20 programming cycles of 10³ runs each. A sufficiently large number of programming cycles is necessary to average out intrinsic control errors (ICE) that, as explained in the main text, prevent the physical couplings from being set with a precision better than ˜5%. To further remove possible sources of systematic noise, at each programming cycle a random gauge transformation is performed on the values of the physical qubits. A permutation of the C×N vertices is a symmetry of the nested graph, but it is not a symmetry of the encoded Hamiltonian obtained after ME. This is because the C×N chains of physical qubits are physically distinguishable. In each programming cycle, a random permutation of the vertices of the nested graph was therefore also performed before proceeding to the ME. Error bars correspond to the standard error of the mean of the 20 P_(C)(α) values.
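A gauge transformation of an Ising problem flips the sign convention of a random subset of spins while leaving the energy spectrum unchanged. The sketch below assumes the standard form h_(i)→g_(i)h_(i), J_(ij)→g_(i)g_(j)J_(ij) with g_(i)∈{−1,+1}; function names are illustrative.

```python
import numpy as np

def apply_gauge(h, J, gauge):
    """Return gauge-transformed fields and couplings: h_i -> g_i h_i, J_ij -> g_i g_j J_ij."""
    g = np.asarray(gauge)
    return g * h, J * np.outer(g, g)

def ising_energy(z, h, J):
    """E(z) = sum_i h_i z_i + sum_{i<j} J_ij z_i z_j (J symmetric, zero diagonal)."""
    return float(h @ z + z @ np.triu(J, k=1) @ z)
```

A configuration z of the original problem has the same energy as g∘z (element-wise product) in the transformed problem, so a readout z′ from the gauged problem is mapped back to the original variables as g∘z′.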

II. Mean Field Analysis of the Partition Function

This section sketches how to compute the partition function of the logical problem [Eq. (3)], in order to analyze the effect of nesting. Hamiltonians of the following form were considered previously:

$\begin{matrix}{H = {{B(t)}\left( {H^{x} + H^{z}} \right)}} & \left( {S{.1}} \right) \\{where} & \; \\{{H^{x} = {{\left\lbrack {{A(t)}/{B(t)}} \right\rbrack H_{X}} = {{- {\Gamma (t)}}{\sum\limits_{i = 1}^{N}{\sum\limits_{c_{i} = 1}^{C}\sigma_{{ic}_{i}}^{x}}}}}}\begin{matrix}{H^{z} = {{\overset{\_}{H}}_{P} = {\sum\limits_{i,{j = 1}}^{N}{\sum\limits_{c_{i},{c_{j}^{\prime} = 1}}^{C}{J_{{({ic}_{i})},{({jc}_{j}^{\prime})}}\sigma_{{ic}_{i}}^{z}\sigma_{{jc}_{j}^{\prime}}^{z}}}}}} & {{~~~~~~~~~~~~~~~~~~~~~~~~~~~}\left( {S{.2}b} \right)} \\{{= {{\frac{J}{N}{\sum\limits_{i \neq j}{\sum\limits_{c_{i},{c_{j}^{\prime} = 1}}^{C}{\sigma_{{ic}_{i}}^{z}\sigma_{{jc}_{j}^{\prime}}^{z}}}}} - {\gamma {\sum\limits_{i = 1}^{N}{\sum\limits_{c_{i} \neq c_{i}^{\prime}}{\sigma_{{ic}_{i}}^{z}\sigma_{{ic}_{i}^{\prime}}^{z}}}}}}},} & {\left( {S{.2}c} \right)}\end{matrix}} & \left( {S{.2}a} \right)\end{matrix}$

Here, A(t) and B(t) have dimensions of energy, while J and γ are dimensionless and have each absorbed a factor of 1/2 to account for double counting. Note that both H^(x) and H^(z) are extensive (proportional to N). Throughout, σ_(ic)^(z)≡σ_(ic_(i))^(z) (σ_(ic)^(x)≡σ_(ic_(i))^(x)) is used to denote the Pauli z (x) operator acting on physical qubit c of encoded qubit i.
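The structure of the nested coupling matrix can be illustrated with a short construction. This is a sketch under stated assumptions: the logical couplings are supplied as a symmetric matrix with zero diagonal, physical qubit c of logical qubit i is stored at index i·C+c, every logical coupling is copied onto all C×C inter-qubit pairs, and distinct copies of the same logical qubit are tied together by a ferromagnetic coupling −γ. The function name and the index layout are illustrative choices.

```python
import numpy as np

def nested_couplings(J_logical, nesting_degree, gamma):
    """Build the (N*C) x (N*C) symmetric coupling matrix of the nested problem."""
    J_logical = np.asarray(J_logical, dtype=float)
    n = J_logical.shape[0]
    C = nesting_degree
    # copy each logical coupling J_ij onto all C*C pairs ((i, c), (j, c'))
    nested = np.kron(J_logical, np.ones((C, C)))
    # inside each logical qubit: ferromagnetic penalty -gamma between distinct copies
    penalty = -gamma * (np.ones((C, C)) - np.eye(C))
    for i in range(n):
        nested[i * C:(i + 1) * C, i * C:(i + 1) * C] = penalty
    return nested
```

For two logical qubits coupled with strength 1 and C=3, the off-diagonal block contains 9=C² copies of the coupling, which is the origin of the C² energy scale discussed in the main text.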

One can define the collective variables

$\begin{matrix}{{S_{i}^{x} \equiv {\frac{1}{C}{\sum\limits_{c_{i} = 1}^{C}\sigma_{{ic}_{i}}^{x}}}},\mspace{31mu} {S_{i}^{z} \equiv {\frac{1}{C}{\sum\limits_{c_{i} = 1}^{C}\sigma_{{ic}_{i}}^{z}}}},{S^{x} \equiv {\frac{1}{N}{\sum\limits_{i = 1}^{N}S_{i}^{x}}}},\mspace{31mu} {S^{z} \equiv {\frac{1}{N}{\sum\limits_{i = 1}^{N}{S_{i}^{z}.}}}}} & \left( {S{.3}} \right)\end{matrix}$

One can interpret S_(i)^(x) and S_(i)^(z) as the mean transverse and longitudinal fields on logical qubit i, respectively. Then

$\begin{matrix}{\mspace{79mu} {{H^{x} = {{{- {\Gamma (t)}}C{\sum\limits_{i = 1}^{N}S_{i}^{x}}} = {{- {NC}}\; {\Gamma (t)}S^{x}}}},}} & \left( {S{.4}} \right) \\{\mspace{85mu} {and}} & \; \\{{{\overset{\_}{H}}_{P} = {{\frac{J}{N}C^{2}{\sum\limits_{i,j}{S_{i}^{z}S_{j}^{z}}}} - {\left( {\frac{J}{N} + \gamma} \right){\sum\limits_{i = 1}^{N}{\sum\limits_{c_{i},c_{i}^{\prime}}{\sigma_{{ic}_{i}}^{z}\sigma_{{ic}_{i}^{\prime}}^{z}}}}} + {\gamma {\sum\limits_{i = 1}^{N}{\sum\limits_{c_{i}}\left( \sigma_{{ic}_{i}}^{z} \right)^{2}}}}}},} & \left( {S{.5}} \right)\end{matrix}$

but the last term is a constant [equal to γNC times the identity operator], so it can be ignored. Therefore, up to a constant:

$\begin{matrix}{{{\overset{\_}{H}}_{P} = {{JNC}^{2}\left( {\left( S^{z} \right)^{2} - {\lambda \frac{1}{N}{\sum\limits_{i = 1}^{N}\left( S_{i}^{z} \right)^{2}}}} \right)}},} & \left( {S{.6}} \right) \\{where} & \; \\{{\lambda = {{\frac{\gamma}{J} + \frac{1}{N}} \geq 0}},} & \left( {S{.7}} \right)\end{matrix}$

encodes the penalty strength; the 1/N correction will disappear in the thermodynamic limit. Note that

${{\frac{1}{N}{\sum\limits_{i = 1}^{N}\left( S_{i}^{z} \right)^{2}}} = {O(1)}},$

so that

${{\lambda \frac{1}{N}{\sum\limits_{i = 1}^{N}\left( S_{i}^{z} \right)^{2}}} = {O(1)}},$

like (S_(i) ^(z))², and hence H _(P) is extensive in N, as it should be.
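The step from Eq. (S.5) to Eq. (S.6) follows directly from the definitions of the collective variables in Eq. (S.3). As a short worked check, with "const" denoting the dropped term γNC:

$\begin{aligned}
\sum_{i,j} S_{i}^{z}S_{j}^{z} &= \Big(\sum_{i} S_{i}^{z}\Big)^{2} = N^{2}\left(S^{z}\right)^{2}, \\
\sum_{c_{i},c_{i}^{\prime}} \sigma_{ic_{i}}^{z}\sigma_{ic_{i}^{\prime}}^{z} &= \Big(\sum_{c_{i}} \sigma_{ic_{i}}^{z}\Big)^{2} = C^{2}\left(S_{i}^{z}\right)^{2}, \\
\overline{H}_{P} &= JNC^{2}\left(S^{z}\right)^{2} - \Big(\frac{J}{N} + \gamma\Big)C^{2}\sum_{i=1}^{N}\left(S_{i}^{z}\right)^{2} + \text{const} \\
&= JNC^{2}\Big[\left(S^{z}\right)^{2} - \Big(\frac{\gamma}{J} + \frac{1}{N}\Big)\frac{1}{N}\sum_{i=1}^{N}\left(S_{i}^{z}\right)^{2}\Big] + \text{const}.
\end{aligned}$

The bracketed prefactor in the last line is the quantity λ of Eq. (S.7).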

The form (S.6) for H _(P) shows that the NQAC Hamiltonian in the fully antiferromagnetic K_(C×N) case can be interpreted as describing the collective evolution of all logical qubits. The term

$\lambda \frac{1}{N}{\sum\limits_{i = 1}^{N}\left( S_{i}^{z} \right)^{2}}$

favors all the spins of each logical qubit (where “spin” refers to the qubit at t=t_(f)) being aligned, since this maximizes each summand.

A. Partition Function Calculation

One starts with the partition function

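$\begin{matrix}{Z = \mathrm{Tr}\, e^{- \beta H} = \mathrm{Tr}\, e^{- \beta B(t)\lbrack H^{x} + H^{z}\rbrack} = \mathrm{Tr}\, e^{- \theta\lbrack H^{x} + H^{z}\rbrack},} & \left( S.8 \right)\end{matrix}$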

where θ=βB(t) is the dimensionless inverse temperature. One can write the partition function explicitly as:

$\begin{matrix}{Z = \sum\limits_{\{\sigma^{z}\}}\left\langle \{\sigma^{z}\} \right|\exp\left\lbrack - \theta\left( H^{z} + H^{x} \right) \right\rbrack\left| \{\sigma^{z}\} \right\rangle = \lim\limits_{M\rightarrow\infty}Z_{M},} & \left( S.9 \right)\end{matrix}$

where Σ_({σ^(z)}) is a sum over all possible 2^(CN) spin configurations in the z basis, and |{σ^(z)}⟩=⊗_(i=1)^(N)⊗_(c=1)^(C)|σ_(ic)^(z)⟩. Z_(M) is determined using the Trotter-Suzuki formula e^(A+B)=lim_(M→∞)(e^(A/M) e^(B/M))^(M):

$\begin{matrix}{Z_{M} = \sum\limits_{\{\sigma^{z}\}}\left\langle \{\sigma^{z}\} \right|\left( \exp\left\lbrack - \frac{\theta}{M}H^{z} \right\rbrack\exp\left\lbrack - \frac{\theta}{M}H^{x} \right\rbrack \right)^{M}\left| \{\sigma^{z}\} \right\rangle.} & \left( S.10 \right)\end{matrix}$
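The convergence of the Trotter-Suzuki approximation Z_(M) to Z can be checked numerically on a toy model. The sketch below is an illustration only: it uses two spins with a single zz coupling and a uniform transverse field (not the full nested Hamiltonian), and computes matrix exponentials of symmetric matrices by eigendecomposition. All function names and the toy model are assumptions made for this example.

```python
import numpy as np

def expm_symmetric(matrix):
    """Matrix exponential of a real symmetric matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(matrix)
    return (vecs * np.exp(vals)) @ vecs.T

def toy_hamiltonians(gamma_field, coupling):
    """Two-spin toy model: H^x = -Gamma (sx1 + sx2), H^z = coupling * sz1 sz2."""
    sz = np.diag([1.0, -1.0])
    sx = np.array([[0.0, 1.0], [1.0, 0.0]])
    eye = np.eye(2)
    h_z = coupling * np.kron(sz, sz)
    h_x = -gamma_field * (np.kron(sx, eye) + np.kron(eye, sx))
    return h_x, h_z

def exact_partition_function(theta, gamma_field, coupling):
    """Z = Tr exp(-theta (H^x + H^z))."""
    h_x, h_z = toy_hamiltonians(gamma_field, coupling)
    return float(np.trace(expm_symmetric(-theta * (h_x + h_z))))

def trotter_partition_function(theta, gamma_field, coupling, slices):
    """Z_M = Tr [exp(-theta/M H^z) exp(-theta/M H^x)]^M."""
    h_x, h_z = toy_hamiltonians(gamma_field, coupling)
    step = expm_symmetric(-theta / slices * h_z) @ expm_symmetric(-theta / slices * h_x)
    return float(np.trace(np.linalg.matrix_power(step, slices)))
```

Evaluating `trotter_partition_function` for increasing `slices` and comparing with `exact_partition_function` shows the discrepancy shrinking as M grows, which is the property the derivation relies on when taking M→∞.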

After a lengthy calculation one can find

$\begin{matrix}{Z \approx \int\prod\limits_{j}dm_{j}\, d{\overset{\sim}{m}}_{j}\,\exp\left\{ N\left\lbrack \frac{1}{N}\sum\limits_{j = 1}^{N}\left\{ C\ln\left\lbrack 2\cosh\sqrt{\left( \theta\Gamma \right)^{2} - \left( {\overset{\sim}{m}}_{j}/C \right)^{2}} \right\rbrack + m_{j}\left( i{\overset{\sim}{m}}_{j} + \theta JC^{2}\lambda m_{j} \right) \right\} - \theta JC^{2}\left\langle m \right\rangle^{2} \right\rbrack \right\},} & \left( S.11 \right)\end{matrix}$

where

$\left\langle m \right\rangle \equiv \frac{1}{N}\sum\limits_{j = 1}^{N}m_{j},$

and where m_(j) is the Hubbard-Stratonovich field that represents S_(j)^(z)(α) after the static approximation (i.e., dropping the α dependence). The second Hubbard-Stratonovich field {tilde over (m)}_(j) acts as a Lagrange multiplier.

B. Free Energy

In the large β (low temperature) limit, the partition function is dominated by the global minimum. This minimum is given by ⟨m⟩=0, which corresponds to either a paramagnetic phase (all m_(j)=0) or a symmetric phase (m_(j)=±m in equal numbers). It can be shown that the system undergoes a second-order quantum phase transition (QPT), with the critical point moving to the left as C and γ grow. Using a saddle point analysis of the partition function, one can show that {tilde over (m)}_(j)=±2iθC²Jλm, and hence the dominant contribution to the partition function is given by:

$\begin{matrix}{Z_{C} \approx e^{N{\{{{C\; {\ln {\lbrack{2\; {\cosh {({{({\theta \; \Gamma})}^{2} + {({2\; \theta \; {JC}\; \lambda \; m})}^{2}})}}^{1/2}}\rbrack}}} - {\theta \; {JC}^{2}\lambda \; m^{2}}}\}}}} & {{~~~~~~~~~~~~}\left( {S{.12}a} \right)} \\{= {e^{N{\{{{C\; {\ln {\lbrack{2\; {\cosh {({{\lbrack{\beta \; {A{(t)}}}\rbrack}^{2} + {\lbrack{2\; \beta \; {B{(t)}}{C{({\gamma + \frac{J}{N}})}}m}\rbrack}^{2}})}}^{1/2}}\rbrack}}} - {\beta \; {B{(t)}}{C^{2}{({\gamma + \frac{J}{N}})}}m^{2}}}\}}}.}} & {\left( {S{.12}b} \right)}\end{matrix}$

For B(t)>0 and in the low temperature limit (θ>>1) one can approximate 2 cosh(θ|x|) as e^(θ|x|),

$\begin{matrix}{Z_{C} \approx e^{N\; \theta {\{{{({{({C\; \Gamma})}^{2} + {({2\; J\; \lambda \; C^{2}m})}^{2}})}^{1/2} - {J\; \lambda \; C^{2}m^{2}}}\}}}} & {{\; \;}\left( {S{.13}a} \right)} \\{= e^{N\; \beta {\{{{({{\lbrack{{CA}{(t)}}\rbrack}^{2} + {\lbrack{2{B{(t)}}{({\gamma + \frac{J}{N}})}C^{2}m}\rbrack}^{2}})}^{1/2} - {{B{(t)}}{({\gamma + \frac{J}{N}})}C^{2}m^{2}}}\}}}} & {\left( {S{.13}b} \right)} \\{{= e^{{- \beta}\; {NF}}},} & \end{matrix}$

where in the second line the physical inverse temperature β has been reintroduced [recall Eq. (S.8)]. Factoring out C² and taking the large N limit then directly yields the free energy density expression Eq. (5).
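As a worked step, and only as a sketch derived from Eq. (S.13b) as written above, the factoring of C² can be made explicit. The shorthand λ_N ≡ γ + J/N is introduced here for brevity and is not notation from the text; B(t)λ_N > 0 is assumed.

```latex
% Free energy density read off from Z_C \approx e^{-\beta N F}, with C^2 factored out
F = -C^{2}\left\{\left[\left(\frac{A(t)}{C}\right)^{2}
      + \left(2B(t)\lambda_{N} m\right)^{2}\right]^{1/2}
      - B(t)\lambda_{N} m^{2}\right\},
\qquad \lambda_{N} \equiv \gamma + \frac{J}{N}.

% Stationarity with respect to m
\frac{\partial F}{\partial m} = 0
\;\Longrightarrow\;
m = 0
\quad\text{or}\quad
m^{2} = 1 - \left(\frac{A(t)}{2CB(t)\lambda_{N}}\right)^{2}.
```

The nonzero solution exists only when A(t) < 2CB(t)λ_N, so in this sketch the transition sits at A(t)/B(t) = 2Cλ_N. For a schedule in which A(t)/B(t) decreases during the anneal, that threshold is reached earlier as C and γ grow, which is consistent with the critical point moving to the left.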

III. Additional Numerical Data

FIGS. 9A and 9B show saturation removal for NQAC applied to antiferromagnetic K₄. FIG. 9A shows SQA results. As one increases the number of sweeps, the flattening of μ_(C) is slowly removed. FIG. 9B shows parallel tempering (infinite sweep number) results. A thermal distribution on the ME fully recovers the no-ME scaling.

In particular, FIG. 9A shows that the saturation of μ_(C) at large C is removed when the number of sweeps is increased. The thermal state, where the system has fully thermalized, can be understood as the limit of an infinite number of sweeps. FIG. 9B shows that the saturation is fully removed for the thermal state (generated using parallel tempering), and nesting is then equivalent to an energy (or temperature) boost close to the ideal result μ_(max) ^(C)=C². This suggests that for a sufficiently large sweep number, performance can be brought to near the ideal result.

FIGS. 10A-10C show parallel tempering (PT) results for antiferromagnetic K₄ with no noise on the couplers. PT was used to generate a thermal state with respect to the ME, which was then decoded using majority voting. FIG. 10A shows success probabilities for different nesting degrees C at β=2. FIG. 10B shows success probabilities for different inverse temperatures at C=4. FIG. 10C shows scaling of μ_(C) for the thermal state. The solid lines represent the best linear fit to all the data points. All the best-fit lines have slopes greater than 0.95, so one can find that the optimal scaling of μ_(C)˜C² is recovered at all (sufficiently large) inverse temperatures tested. This illustrates that for a sufficiently cold equilibrated system ME does not result in a suboptimal energy boost.

FIGS. 10A-10C give further evidence that nesting can be interpreted as an effective reduction of temperature by studying the success probability associated with the thermal distribution on the ME. Parallel tempering (PT) was used to sample from the thermal state associated with the ME of the different NQAC cases shown in FIGS. 4A-4C, and the samples were decoded using majority voting. One can find that the thermal state at different temperatures but fixed C exhibits the same qualitative behavior as the thermal state at fixed temperature but different C [see FIG. 10A versus FIG. 10B]. Therefore, the performance improvement associated with reducing the temperature can also be reproduced by increasing C. This reinforces that the energy boost can also be interpreted as a decrease of the effective temperature of the device. One can also find that the thermal state exhibits an energy boost scaling of μ_(C)˜C² [see FIG. 10C].
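The majority-vote decoding step mentioned above can be sketched as follows. This is a minimal illustration, not the implementation used for the figures: the function name `decode_logical`, the dictionary layout, and the tie-breaking rule (ties resolved to a fixed value `tie_break`) are assumptions introduced here. Each logical qubit is decoded by a single vote over all physical qubits that represent it, i.e., over the chains of all C encoded copies.

```python
def decode_logical(readout, groups, tie_break=1):
    """Majority-vote decoding of a measured physical state.

    readout: dict mapping physical-qubit label -> measured spin (+1 or -1).
    groups:  dict mapping logical-qubit label -> list of all physical qubits
             representing it (every chain of every encoded copy).
    tie_break: value returned on an exact tie (hypothetical convention).
    """
    decoded = {}
    for logical, qubits in groups.items():
        total = sum(readout[q] for q in qubits)
        if total == 0:
            decoded[logical] = tie_break
        else:
            decoded[logical] = 1 if total > 0 else -1
    return decoded


# Toy readout: two logical qubits, three physical qubits each.
readout = {0: 1, 1: 1, 2: -1, 3: -1, 4: -1, 5: 1}
groups = {"q0": [0, 1, 2], "q1": [3, 4, 5]}
decoded = decode_logical(readout, groups)
```

A random tie-break could be substituted for the fixed one; the fixed value is used here only to keep the sketch deterministic.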

IV. Choi And Heuristic Embedding

The “Chimera” hardware connectivity graph of the D-Wave devices allows for a ME of complete graphs. Above, this is identified as “Choi minor embedding”.

FIGS. 11A and 11B show MEs of a K₃₂. These were used, e.g., to minor-embed a C=8 nesting of a K₄, or a C=4 nesting of a K₈. FIG. 11A shows the Choi embedding implemented on a perfect Chimera graph. FIG. 11B shows a heuristic ME for the actual D-WAVE 2000Q device used in this work, whose Chimera graph contains 8 unusable qubits (circles). Different colors (and labels) denote chains representing minor-embedded logical qubits. Thin lines are logical couplings, while thick lines represent energy penalties (ferromagnetic couplings).

The Choi technique requires a perfect Chimera graph, without missing vertices. In actual devices, however, imperfections in fabrication or the calibration process lead to the presence of unusable qubits (e.g., due to trapped flux). These qubits, along with their couplings, are then permanently disabled and cannot be used in the QA process. Efficient heuristic algorithms have been developed to search for MEs for the resulting induced Chimera subgraphs. FIG. 11B shows the ME of a K₃₂ obtained when a heuristic algorithm is applied to the actual hardware graph of the D-WAVE 2000Q “Vesuvius” chip installed at USC-ISI. Note how the ME avoids the unusable qubits, depicted as black circles in FIG. 11B.

The MEs shown in FIGS. 11A and 11B are the actual “Choi” and “heuristic” MEs used in the experiments and simulations. As discussed above, SQA simulations demonstrate that the choice of the ME has a significant impact on the performance of NQAC. In particular, it turns out that the Choi ME outperforms the heuristic ME.

V. Additional Experimental Data

Additional experimental data are presented here for K_(N)'s with couplings randomly generated from the set J_(ij)∈{0.1, 0.2, . . . , 0.9, 1}. For large N, K_(N) generated in this manner have a finite temperature spin glass phase transition. This property renders simulated annealing inefficient in finding the ground state of such problems. The previously discussed results report data for a random K₈ instance that is referred to here as “hard-K₈”:

$\begin{matrix}{K_{8}^{h} = {\begin{pmatrix}0 & 0.4 & 0.7 & 0.5 & 0.3 & 0.5 & 0.2 & 0.5 \\0 & 0 & 0.3 & 0.8 & 0.8 & 0.3 & 0.5 & 0.7 \\0 & 0 & 0 & 0.5 & 0.9 & 0.9 & 0.3 & 0.9 \\0 & 0 & 0 & 0 & 1 & 0.8 & 0.8 & 0.7 \\0 & 0 & 0 & 0 & 0 & 0.9 & 0.3 & 0.6 \\0 & 0 & 0 & 0 & 0 & 0 & 0.9 & 0.4 \\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 \\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\end{pmatrix}.}} & \left( {S{.14}} \right)\end{matrix}$

Data was also collected for another random K₈ instance that turned out to have a higher success probability, so one can refer to it as “easy-K₈”:

$\begin{matrix}{K_{8}^{e} = {\begin{pmatrix}0 & 0.8 & 0.7 & 0.8 & 0.9 & 0.4 & 0.2 & 0.9 \\0 & 0 & 0.7 & 0.8 & 0.3 & 0.7 & 1 & 0.3 \\0 & 0 & 0 & 0.7 & 0.6 & 0.1 & 0.5 & 0.6 \\0 & 0 & 0 & 0 & 0.1 & 0.8 & 0.1 & 0.5 \\0 & 0 & 0 & 0 & 0 & 0.5 & 0.8 & 0.2 \\0 & 0 & 0 & 0 & 0 & 0 & 0.6 & 0.7 \\0 & 0 & 0 & 0 & 0 & 0 & 0 & {1\;} \\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\end{pmatrix}.}} & \left( {S{.15}} \right)\end{matrix}$

The results of these two instances are shown in FIGS. 12A-12D. In particular, FIGS. 12A and 12B show P_(C)(α) and adjusted P_(C)′(α) for the hard-K₈ instance. FIGS. 12C and 12D show P_(C)(α) and adjusted P_(C)′(α) for the easy-K₈ instance.

Data was also collected for an “easy-K₁₀” instance and a “hard-K₁₀” instance:

$\begin{matrix}{{K_{10}^{e} = \begin{pmatrix}0 & 0.2 & 0.7 & 0.8 & 0.5 & 0.3 & 0.8 & 0.9 & 0.4 & 0.1 \\0 & 0 & 0.1 & 0.1 & 0.4 & 0.7 & 0.3 & 0.3 & 0.9 & 0.1 \\0 & 0 & 0 & 0.3 & 0.8 & 0.7 & 0.6 & 0.9 & 0.6 & 0.6 \\0 & 0 & 0 & 0 & 0.8 & 0.2 & 0.7 & 0.3 & 0.6 & 0.8 \\0 & 0 & 0 & 0 & 0 & 0.2 & 0.9 & 1 & 1 & 1 \\0 & 0 & 0 & 0 & 0 & 0 & 1 & 0.4 & 0.3 & 0.2 \\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.2 & 0.8 & 0.6 \\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.8 & 0.5 \\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.1 \\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\end{pmatrix}},} & \left( {S{.16}} \right) \\{K_{10}^{h} = {\begin{pmatrix}0 & 0.6 & 0.9 & 0.8 & 0.5 & 1 & 0.4 & 0.2 & 0.1 & 0.5 \\0 & 0 & 0.8 & 0.9 & 0.1 & 0.6 & 0.2 & 0.7 & 0.7 & 0.9 \\0 & 0 & 0 & 0.8 & 0.6 & 0.3 & 0.8 & 0.2 & 0.6 & 0.6 \\0 & 0 & 0 & 0 & 0.1 & 0.3 & 0.8 & 0.4 & 0.6 & 0.5 \\0 & 0 & 0 & 0 & 0 & 0.7 & 0.6 & 0.4 & 0.3 & 0.1 \\0 & 0 & 0 & 0 & 0 & 0 & 0.1 & 1 & 0.9 & 0.6 \\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.9 & 0.9 & 0.9 \\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.1 & 1.0 \\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.3 \\0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\end{pmatrix}.}} & \;\end{matrix}$

The results of these two instances are shown in FIGS. 13A-13D. In particular, FIGS. 13A and 13B show P_(C)(α) and adjusted P_(C)′(α) for the hard-K₁₀ instance. FIGS. 13C and 13D show P_(C)(α) and adjusted P_(C)′(α) for the easy-K₁₀ instance.

In all cases, results are displayed up to nesting degree C=3.
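For orientation, the K_(C×N) into which such an instance is nested can be sketched as below, using the nesting rules {tilde over (J)}_((i,c),(j,c′))=J_(ij), {tilde over (h)}_((i,c))=Ch_(i), and a penalty −γ. The function name `nest` and the dictionary layout are assumptions made for this illustration, and the sketch assumes that the penalty −γ couples distinct copies c≠c′ of the same logical qubit i.

```python
def nest(J, h, C, gamma):
    """Build nested couplers and local fields on nodes (i, c).

    J: dict {(i, j): J_ij} with i < j, the logical K_N couplings.
    h: list of N logical local fields.
    C: nesting level; gamma: penalty strength (coupling value -gamma).
    Returns (J_nested, h_nested) keyed by node pairs / nodes (i, c).
    """
    N = len(h)
    h_nested = {(i, c): C * h[i] for i in range(N) for c in range(C)}
    J_nested = {}
    # Every copy of i couples to every copy of j with the logical J_ij.
    for (i, j), J_ij in J.items():
        for c in range(C):
            for cp in range(C):
                J_nested[((i, c), (j, cp))] = J_ij
    # Assumed penalty: distinct copies of the same logical qubit.
    for i in range(N):
        for c in range(C):
            for cp in range(c + 1, C):
                J_nested[((i, c), (i, cp))] = -gamma
    return J_nested, h_nested


# Toy example: antiferromagnetic K_4 (all J_ij = 1, no fields), C = 2.
J_k4 = {(i, j): 1.0 for i in range(4) for j in range(i + 1, 4)}
J_nested, h_nested = nest(J_k4, [0.0] * 4, C=2, gamma=0.5)
```

With N=8 and C=3 the same construction gives 24 nodes and 276 couplers, i.e., a complete K₂₄.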

FIGS. 14A-14D show the optimal penalty strength as a function of the energy scale for the four instances considered. In particular, FIG. 14A shows optimal penalty values (γ) for the hard-K₈ instance and FIG. 14B shows the values for the easy-K₈ instance. FIG. 14C shows values for the hard-K₁₀ instance and FIG. 14D shows the values for the easy-K₁₀ instance. A saturation of the optimal penalty is visible at the maximal possible value |γ|=1 for α close to 1, implying that the true optimal penalty values are >1 in this range.

FIGS. 15A-15D show data collapse and μ_(C) scaling results for the antiferromagnetic hard-K₈ problem considered above, as well as the easy-K₈ problem. In particular, FIGS. 15A and 15B, respectively, show data collapse and μ_(C) scaling results for the hard-K₈ problem. FIGS. 15C and 15D, respectively, show data collapse and μ_(C) scaling results for the easy-K₈ problem. As shown in FIGS. 15A and 15C, there is a data collapse to the left of the peak. Recall that the peak is due to having reached the maximum penalty value, as illustrated in FIGS. 14A-14D. The associated scaling of the energy boost μ_(C), as shown in FIGS. 15B and 15D, yields μ_(C)˜C^(1.32) for the hard-K₈ instance and μ_(C)˜C^(1.26) for the easy-K₈ instance.

FIGS. 16A-16D show data collapse and μ_(C) scaling results for the antiferromagnetic hard-K₁₀ problem considered above, as well as the easy-K₁₀ problem. In particular, FIGS. 16A and 16B, respectively, show data collapse and μ_(C) scaling results for the hard-K₁₀ problem. FIGS. 16C and 16D, respectively, show data collapse and μ_(C) scaling results for the easy-K₁₀ problem. For both of these instances, μ_(C)˜C^(1.34).
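A scaling exponent of the form μ_(C)˜C^(η) can be estimated, for example, by a least-squares line through (ln C, ln μ_(C)). The text does not state how the quoted exponents were extracted, so the log-log fit below is an assumption, and the function name `fit_power_law` and the input values are synthetic placeholders rather than measured data.

```python
import math


def fit_power_law(c_values, mu_values):
    """Least-squares fit of mu = prefactor * C**eta in log-log space.

    Returns (eta, prefactor). Requires at least two distinct C values
    and strictly positive inputs.
    """
    xs = [math.log(c) for c in c_values]
    ys = [math.log(mu) for mu in mu_values]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    eta = sxy / sxx
    prefactor = math.exp(y_mean - eta * x_mean)
    return eta, prefactor


# Synthetic check: data generated exactly as C**1.5.
cs = [1, 2, 3, 4]
mus = [c ** 1.5 for c in cs]
eta, prefactor = fit_power_law(cs, mus)
```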

VI. Determination of μ_(C)

To determine the values of μ_(C) and estimate error bars, one proceeds as follows. First, smoothing splines are used to determine a continuous interpolation P_(C) ^(mid)(α) of the discrete data points P_(C)(α). In the same way, one determines the higher and lower interpolating curves P_(C) ^(high)(α) and P_(C) ^(low)(α) for the data points P_(C)(α)+δP_(C)(α) and P_(C)(α)−δP_(C)(α), respectively, where δP_(C)(α) denotes the standard error of P_(C)(α). A reference value α_(C) ^(mid) is then determined from the smooth interpolation of the experimental data such that P_(C) ^(mid)(α_(C) ^(mid))=P₀. The energy boost is then determined as μ_(C)=α₁ ^(mid)/α_(C) ^(mid). P₀ is an arbitrarily chosen reference value where the different P_(C)(α) curves are overlapped. This reference serves as a base point for computing μ_(C). As shown above for the K₄, the overlap of the P_(C) data over the entire α range means that the specific choice of P₀ is arbitrary.

One can similarly determine μ_(C) ^(high)=α₁ ^(high)/α_(C) ^(high) and μ_(C) ^(low)=α₁ ^(low)/α_(C) ^(low) using the corresponding interpolating curves. The error bars shown in the figures were then centered at μ_(C), with lower and upper error bars being μ_(C) ^(high) and μ_(C) ^(low), respectively.
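The procedure above can be sketched as follows. To stay self-contained, this sketch swaps the smoothing splines for piecewise-linear interpolation, which is a simplification and not the method described; the function names and the sample curves are hypothetical, and each curve is assumed to cross P₀ within the sampled α range.

```python
def crossing(alphas, probs, p0):
    """Return alpha where the piecewise-linear curve through
    (alphas, probs) first reaches p0."""
    points = list(zip(alphas, probs))
    for (a0, q0), (a1, q1) in zip(points, points[1:]):
        # Segment brackets p0 (and is not flat): invert the linear piece.
        if (q0 - p0) * (q1 - p0) <= 0 and q0 != q1:
            return a0 + (p0 - q0) * (a1 - a0) / (q1 - q0)
    raise ValueError("curve does not cross p0 in the sampled range")


def energy_boost(alphas, probs_c1, probs_c, p0):
    """mu_C = alpha_1 / alpha_C, both evaluated at the reference p0."""
    return crossing(alphas, probs_c1, p0) / crossing(alphas, probs_c, p0)


# Synthetic curves: the C-curve equals the C=1 curve with alpha rescaled by 2.
alphas = [0.1 * k for k in range(1, 11)]
p_c1 = [0.5 * a for a in alphas]
p_c = [1.0 * a for a in alphas]
mu = energy_boost(alphas, p_c1, p_c, p0=0.3)
```

Applying `energy_boost` to the curves built from P_(C)(α)+δP_(C)(α) and P_(C)(α)−δP_(C)(α) gives the corresponding μ_(C) ^(high) and μ_(C) ^(low).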

VII. Numerical Methods

Previously, results were reported based on quantum Monte Carlo techniques described above. Here this technique is briefly reviewed. Simulated Quantum Annealing (SQA) is a quantum Monte Carlo based algorithm whereby Monte Carlo dynamics are used to sample from the instantaneous Gibbs state associated with the Hamiltonian H(t) of the system. The state at the end of the quantum Monte Carlo simulation of the quantum Hamiltonian H(t) is used as the initial state for the next Monte Carlo simulation with Hamiltonian H(t+Δt). This is repeated until H(t_(f)) is reached. SQA was originally proposed as an optimization algorithm [13, 14], but it has since gained traction as a computationally efficient classical description for T>0 quantum annealers. An important caveat is that SQA does not capture the unitary dynamics of the quantum system, but it is hoped that the sampling of the instantaneous Gibbs state captures thermal processes in the quantum annealer, which may be the dominant dynamics if the evolution is sufficiently slow. Although there is strong evidence that SQA does not completely capture the final-time output of the D-Wave processors, at present it is the only viable means to simulate large (>15 qubits) open QA systems. Discrete-time quantum Monte Carlo was used in these simulations with the number of Trotter slices fixed to 64. Spin updates were performed via Wolff-cluster updates along the Trotter direction only.
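A minimal discrete-time SQA sketch is given below for illustration. It departs from the simulations described above in several labeled ways: it uses single-spin Metropolis updates instead of Wolff-cluster updates along the Trotter direction, it assumes a linear schedule A(s)=1−s, B(s)=s, and it assumes the Hamiltonian convention H(s)=−A(s)Σσ^x+B(s)(Σh_(i)σ_(i)^z+ΣJ_(ij)σ_(i)^zσ_(j)^z). The function name `sqa` and all parameter defaults other than the 64 Trotter slices are hypothetical.

```python
import math
import random


def sqa(h, J, beta=5.0, n_slices=64, n_sweeps=200, seed=0):
    """Path-integral Monte Carlo anneal; returns one Trotter slice.

    h: list of local fields; J: dict {(i, j): J_ij} with i < j.
    One Metropolis sweep is performed per schedule step, and the
    configuration is carried over from H(t) to H(t + dt).
    """
    rng = random.Random(seed)
    n = len(h)
    nbrs = [[] for _ in range(n)]
    for (i, j), J_ij in J.items():
        nbrs[i].append((j, J_ij))
        nbrs[j].append((i, J_ij))
    spins = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(n_slices)]
    for sweep in range(n_sweeps):
        s = (sweep + 0.5) / n_sweeps          # never exactly 0 or 1
        a, b = 1.0 - s, s                     # assumed linear schedule
        # Ferromagnetic coupling between neighboring Trotter slices.
        j_perp = -0.5 * math.log(math.tanh(beta * a / n_slices))
        for k in range(n_slices):
            above = spins[(k + 1) % n_slices]
            below = spins[k - 1]              # periodic in imaginary time
            for i in range(n):
                si = spins[k][i]
                local = h[i] + sum(J_ij * spins[k][j] for j, J_ij in nbrs[i])
                # Change in the classical action if spin (i, k) is flipped.
                d_action = (-2.0 * si * local * beta * b / n_slices
                            + 2.0 * j_perp * si * (above[i] + below[i]))
                if d_action <= 0 or rng.random() < math.exp(-d_action):
                    spins[k][i] = -si
    return list(spins[0])


# Toy run: two spins with a ferromagnetic coupling J_01 = -1.
state = sqa([0.0, 0.0], {(0, 1): -1.0})
```

Reading out a single slice at the end is one possible convention; other choices (e.g., a randomly selected slice) are equally plausible.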

What is claimed is:

1) A method of processing using a quantum processor, the method comprising: obtaining a problem Hamiltonian; defining a nested Hamiltonian with a plurality of logical qubits by embedding a logical K_(N) representing the problem Hamiltonian into a larger K_(C×N), where N represents a number of the logical qubits and C represents a nesting level defining the amount of hardware resources for the nested Hamiltonian; encoding the nested Hamiltonian into the plurality of physical qubits of the quantum processor; and performing a quantum annealing process with the quantum processor after the encoding.

2) The method of claim 1, further comprising: measuring the plurality of physical qubits; and recovering a logical state of each of the plurality of logical qubits using a decoding procedure.

3) The method of claim 2, wherein the encoding further comprises performing a minor embedding process comprising replacing each of the plurality of logical qubits in the nested Hamiltonian by a ferromagnetically coupled chain of qubits, such that all couplings in the nested Hamiltonian are represented by inter-chain couplings.

4) The method of claim 3, wherein a number of physical qubits necessary for the minor embedding of the K_(C×N) is N_(C,Phys)=CNL˜C²N².

5) The method of claim 2, wherein the decoding procedure is performed over both a length (L) chain of each encoded qubit and C encoded qubits comprising each logical qubit.

6) The method of claim 1, wherein the hardware resources comprise at least one of physical qubits, couplers, and local fields.

7) The method of claim 1, wherein each logical qubit i (i=1, . . . , N) is represented by a C-tuple of encoded qubits (i, c), with c=1, . . . , C.

8) The method of claim 7, wherein the hardware resources comprise nested couplers {tilde over (J)}_((i,c),(j,c′)) and local fields {tilde over (h)}_((i,c)) where
{tilde over (J)}_((i,c),(j,c′)) = J_(ij), ∀c,c′, i≠j,
{tilde over (h)}_((i,c)) = Ch_(i), ∀c,i,
{tilde over (J)}_((i,c),(j,c′)) = −γ, ∀c≠c′.

9) A processing system, comprising: a digital computer comprising a digital processor and a memory having stored thereon instructions for causing the digital processor to: obtain a problem Hamiltonian, define a nested Hamiltonian with a plurality of logical qubits by embedding a logical K_(N) representing the problem Hamiltonian into a larger K_(C×N), where N represents a number of the logical qubits and C represents a nesting level defining the amount of hardware resources for the nested Hamiltonian; and an analog computer coupled to the digital computer, the analog computer comprising a quantum processor and configured for: encoding the nested Hamiltonian into a plurality of physical qubits of the quantum processor, and performing a quantum annealing process with the quantum processor after the encoding.

10) The system of claim 9, wherein the analog computer is configured for measuring the plurality of physical qubits, and wherein the instructions further comprise instructions for causing the digital processor to recover a logical state of each of the plurality of logical qubits using a decoding procedure.

11) The system of claim 10, wherein the encoding further comprises performing a minor embedding process comprising replacing each of the plurality of logical qubits in the nested Hamiltonian by a ferromagnetically coupled chain of qubits, such that all couplings in the nested Hamiltonian are represented by inter-chain couplings.

12) The system of claim 11, wherein a number of physical qubits necessary for the minor embedding of the K_(C×N) is N_(C,Phys)=CNL˜C²N².

13) The system of claim 10, wherein the decoding procedure is performed over both a length (L) chain of each encoded qubit and C encoded qubits comprising each logical qubit.

14) The system of claim 9, wherein the hardware resources comprise at least one of physical qubits, couplers, and local fields.

15) The system of claim 9, wherein each logical qubit i (i=1, . . . , N) is represented by a C-tuple of encoded qubits (i, c), with c=1, . . . , C.

16) The system of claim 9, wherein the hardware resources comprise nested couplers {tilde over (J)}_((i,c),(j,c′)) and local fields {tilde over (h)}_((i,c)) where
{tilde over (J)}_((i,c),(j,c′)) = J_(ij), ∀c,c′, i≠j,
{tilde over (h)}_((i,c)) = Ch_(i), ∀c,i,
{tilde over (J)}_((i,c),(j,c′)) = −γ, ∀c≠c′.